Abstract
To capture the higher-order autocorrelation structure of finite-range integer-valued time series of counts, and to account for the driving effect of covariates on the underlying process, this paper introduces a pth-order random coefficients mixed binomial autoregressive process with explanatory variables. The basic probabilistic and statistical properties of the model are discussed. Conditional least squares and conditional maximum likelihood estimators, together with their asymptotic properties, are obtained. Moreover, the existence of the explanatory variables is tested via a Wald-type test, and the forecasting problem is also considered. Finally, some numerical results for the estimators and a real data example are presented to illustrate the performance of the proposed model.
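As an illustrative sketch (not the authors' code), the data-generating mechanism described above can be simulated in Python, assuming logistic links for the two thinning probabilities and mixing weights \(\phi_i\) over the lags; all function and variable names below are hypothetical:

```python
import numpy as np

def simulate_rcmbar(n, N, phi, delta1, delta2, Z, rng):
    """Simulate a simplified RCMBAR(p)-X path (sketch).

    At each time t a lag i is drawn with mixing probabilities phi; then
    X_t = alpha_t "thinned" X_{t-i} plus beta_t "thinned" (N - X_{t-i}),
    where the thinning probabilities alpha_t and beta_t follow logistic
    regressions on the covariate vector Z_t.
    """
    p = len(phi)
    X = np.full(n + p, N // 2, dtype=int)            # arbitrary initial values
    for t in range(p, n + p):
        i = rng.choice(p, p=phi) + 1                 # random lag (mixture component)
        a = 1.0 / (1.0 + np.exp(-Z[t - p] @ delta1))  # alpha_t via logistic link
        b = 1.0 / (1.0 + np.exp(-Z[t - p] @ delta2))  # beta_t via logistic link
        # binomial thinning of X_{t-i} and of its complement N - X_{t-i}
        X[t] = rng.binomial(X[t - i], a) + rng.binomial(N - X[t - i], b)
    return X[p:]
```

By construction each simulated value satisfies \(0 \le X_t \le N\), matching the finite range \(\{0,1,\ldots,N\}\) of the process.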
References
Al-Osh MA, Alzaid AA (1987) First-order integer-valued autoregressive (INAR(1)) process. J Time Ser Anal 8(3):261–275. https://doi.org/10.1111/j.1467-9892.1987.tb00438.x
Billingsley P (1961) Statistical inference for Markov processes. University of Chicago Press, Chicago
Brännäs K (1995) Explanatory variables in the AR(1) count data model. Umeå Econ Stud 381
Brännäs K, Nordström J (2006) Tourist accommodation effects of festivals. Tour Econ 12(2):291–302. https://doi.org/10.5367/000000006777637458
Chen CWS, Khamthong K, Lee S (2019) Markov switching integer-valued generalized auto-regressive conditional heteroscedastic models for dengue counts. J R Stat Soc Ser C Appl Stat 68(4):963–983. https://doi.org/10.1111/rssc.12344
Chen H, Li Q, Zhu F (2020) Two classes of dynamic binomial integer-valued ARCH models. Braz J Probab Stat 34:685–711. https://doi.org/10.1214/19-BJPS452
Ding X, Wang D (2016) Empirical likelihood inference for INAR(1) model with explanatory variables. J Korean Stat Soc 45(4):623–632. https://doi.org/10.1016/j.jkss.2016.05.004
Enciso-Mora V, Neal P, Rao TS (2009) Integer valued AR processes with explanatory variables. Sankhyā Indian J Stat 71(2):248–263. http://www.jstor.org/stable/41343031
Freeland RK, McCabe BPM (2004) Analysis of low count time series data by Poisson autoregression. J Time Ser Anal 25(5):701–722. https://doi.org/10.1111/j.1467-9892.2004.01885.x
Freeland RK, McCabe BPM (2004) Forecasting discrete valued low count time series. Int J Forecast 20(3):427–434. https://doi.org/10.1016/S0169-2070(03)00014-1
Kang Y, Wang D, Yang K (2021) A new INAR(1) process with bounded support for counts showing equidispersion. Stat Pap 62(2):745–767. https://doi.org/10.1007/s00362-019-01111-0
Karlin S, Taylor HM (1975) A first course in stochastic processes, 2nd edn. Academic Press, New York
Klimko LA, Nelson PI (1978) On conditional least squares estimation for stochastic processes. Ann Stat 6(3):629–642. https://doi.org/10.1214/aos/1176344207
McKenzie E (1985) Some simple models for discrete variate time series. J Am Water Resour Assoc 21(4):645–650. https://doi.org/10.1111/j.1752-1688.1985.tb05379.x
Möller T, Silva M, Weiß C et al (2016) Self-exciting threshold binomial autoregressive processes. AStA Adv Stat Anal 100(4):369–400. https://doi.org/10.1007/s10182-015-0264-6
Nik S, Weiß CH (2021) Smooth-transition autoregressive models for time series of bounded counts. Stoch Model 37(4):568–588. https://doi.org/10.1080/15326349.2021.1945934
Pedeli X, Davison AC, Fokianos K (2015) Likelihood estimation for the INAR(\(p\)) model by saddlepoint approximation. J Am Stat Assoc 110(511):1229–1238. https://doi.org/10.1080/01621459.2014.983230
Ristić MM, Weiß CH, Janjić AD (2016) A binomial integer-valued ARCH model. Int J Biostat 12(2). https://doi.org/10.1515/ijb-2015-0051
Scotto MG, Weiß CH, Silva ME et al (2014) Bivariate binomial autoregressive models. J Multivar Anal 125:233–251. https://doi.org/10.1016/j.jmva.2013.12.014
Silva MED, Oliveira VL (2004) Difference equations for the higher-order moments and cumulants of the INAR(1) model. J Time Ser Anal 25(3):317–333. https://doi.org/10.1111/j.1467-9892.2004.01685.x
Steutel F, van Harn K (1979) Discrete analogues of self-decomposability and stability. Ann Probab 7(5):893–899. https://doi.org/10.1214/aop/1176994950
Wang C, Liu H, Yao JF et al (2014) Self-excited threshold Poisson autoregression. J Am Stat Assoc 109(506):777–787. https://doi.org/10.1080/01621459.2013.872994
Wang D, Cui S, Cheng J et al (2021) Statistical inference for the covariates-driven binomial AR(1) process. Acta Math Appl Sin Engl Ser 37:758–772. https://doi.org/10.1007/s10255-021-1043-7
Wang X (2020) Variable selection for first-order Poisson integer-valued autoregressive model with covariables. Aust N Z J Stat 62:278–295. https://doi.org/10.1111/anzs.12295
Weiß CH (2009) Monitoring correlated processes with binomial marginals. J Appl Stat 36(4):399–414. https://doi.org/10.1080/02664760802468803
Weiß CH (2009) A new class of autoregressive models for time series of binomial counts. Commun Stat Theory Methods 38(4):447–460. https://doi.org/10.1080/03610920802233937
Weiß CH, Pollett PK (2014) Binomial autoregressive processes with density dependent thinning. J Time Ser Anal 35(2):115–132. https://doi.org/10.1002/jtsa.12054
Yang K, Wang D, Jia B et al (2018) An integer-valued threshold autoregressive process based on negative binomial thinning. Stat Pap 59(3):1131–1160. https://doi.org/10.1007/S00362-016-0808-1
Yang K, Wang D, Li H (2018) Threshold autoregression analysis for finite range time series of counts with an application on measles data. J Stat Comput Simul 88(3):597–614. https://doi.org/10.1080/00949655.2017.1400032
Yang K, Li H, Wang D et al (2021) Random coefficients integer-valued threshold autoregressive processes driven by logistic regression. AStA Adv Stat Anal 105:533–557. https://doi.org/10.1007/s10182-020-00379-0
Yang K, Yu X, Zhang Q et al (2022) On MCMC sampling in self-exciting integer-valued threshold time series models. Comput Stat Data Anal 169:107410. https://doi.org/10.1016/j.csda.2021.107410
Yang K, Li A, Li H et al (2023) High-order self-excited threshold integer-valued autoregressive model: estimation and testing. Commun Math Stat. https://doi.org/10.1007/s40304-022-00325-3
Zhang J, Wang D, Yang K et al (2020) A multinomial autoregressive model for finite-range time series of counts. J Stat Plan Inference 207:320–343. https://doi.org/10.1016/j.jspi.2020.01.005
Zhang J, Wang J, Tai Z et al (2022) A study of binomial AR(1) process with an alternative generalized binomial thinning operator. J Korean Stat Soc 52:110–129. https://doi.org/10.1007/s42952-022-00193-1
Zhang R, Wang D (2023) A new binomial autoregressive process with explanatory variables. J Comput Appl Math 420:114814. https://doi.org/10.1016/j.cam.2022.114814
Zhu R, Joe H (2006) Modelling count data time series with Markov processes based on binomial thinning. J Time Ser Anal 27(5):725–738. https://doi.org/10.1111/j.1467-9892.2006.00485.x
Acknowledgements
We gratefully acknowledge the associate editor, and anonymous referees for their valuable time and helpful comments that have helped improve this article substantially.
Funding
This work is supported by the National Natural Science Foundation of China (No. 11901053), the Natural Science Foundation of Jilin Province (Nos. YDZJ202301ZYTS393, 20230201078GX, 20220101038JC, 20210101149JC), the Postdoctoral Foundation of Jilin Province (No. 2023337), and the Scientific Research Project of Jilin Provincial Department of Education (Nos. JJKH20220671KJ, JJKH20230665KJ).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: The derivations of moments
In the following, we derive the moments of the RCMBAR(p)-X model. With the representation (2.4), the calculation of the conditional expectation is trivial and therefore omitted. We proceed to derive the conditional variance.
Since (2.4) implies \(D_{t,i}D_{t,j}=0\) for \(i \ne j\), it follows that
Therefore, the conditional variance can be derived as follows:
Appendix B: The proofs of theorems
Proof of Proposition 2.1
We first prove that the RCMBAR(p)-X process defined by (2.2) is an irreducible and aperiodic Markov chain. Without loss of generality, denote by \((\Omega _j,\mathscr {A}_j,P_j)\) the probability space of \(Z_{j,t}\). It follows from Definition 1 that \(E|Z_{j,t}|=\int _{\Omega _j}|Z_{j,t}|\,\textrm{d}P_j < \infty \), which implies
Since each term in (2.5) is strictly greater than zero, we obtain
Equation (A.1) implies that the process (2.2) is an irreducible and aperiodic chain. Since the state space \(\mathbb {S}:=\left\{ 0, 1, \ldots , N\right\} \) has only a finite number of elements, \(\{X_t\}_{t\in \mathbb {Z}}\) is also a positive recurrent Markov chain and hence ergodic. Finally, Theorem 1.3 in Karlin and Taylor (1975) guarantees the existence of the stationary distribution for \(\{X_t \}\). \(\square \)
Proof of Theorem 3.1
In order to prove Theorem 3.1, we need to check that all the regularity conditions of Theorems 3.1 and 3.2 in Klimko and Nelson (1978) hold. The regularity conditions for Theorem 3.1 in Klimko and Nelson (1978) are as follows:
(i) \(\partial g/\partial \theta _i\), \(\partial ^2\,g/\partial \theta _i\partial \theta _j\), \(\partial ^3\,g/\partial \theta _i\partial \theta _j\partial \theta _k\) exist and are continuous for all \(\varvec{\theta }\in \Theta \), where \(\theta _i\), \(\theta _j\), \(\theta _k\) denote the components of \(\varvec{\theta }\), \(i,j,k\in \{1,2, \ldots ,m\}\), g is the abbreviation for \(g(\varvec{\theta },X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t)\), \(m=2q+p+1\) denotes the dimension of \(\varvec{\theta }\);
(ii) For \(i,j \in \{1,2,\ldots ,m\}\), \(E|(X_1-g)\partial g/\partial \theta _i|<\infty \), \(E|(X_1-g)\partial ^2\,g/\partial \theta _i\partial \theta _j|<\infty \) and \(E|\partial g/\partial \theta _i \cdot \partial g/\partial \theta _j|<\infty \), where g and its partial derivatives are evaluated at \(\varvec{\theta }_0\) and the \(\sigma \)-field generated by all the information before time zero;
(iii) For \(i,j,k \in \{1,2,\ldots ,m\}\) there exist functions \(H^{(0)}(X_0,\ldots ,X_{1-p})\), \(H_i^{(1)}(X_0,\ldots ,X_{1-p})\), \(H_{ij}^{(2)}(X_0,\ldots ,X_{1-p})\), \(H_{ijk}^{(3)}(X_0,\ldots ,X_{1-p})\) such that
for all \(\varvec{\theta }\in \Theta \), and
Recall that \(g(\varvec{\theta },X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t)=\sum _{i=1}^p \phi _i \left( \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}{1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}X_{t-i}+ \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}{1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}(N-X_{t-i})\right) \). It is easy to check that condition (i) holds. Denote by \(p_k:=P(X_t=k)\), \(k=0,1,\ldots ,N\). Thus, we have for any fixed \(s \ge 1\),
By Definition 1, \(\varvec{Z}_t\) has a finite covariance matrix, which, together with (B.3), ensures that condition (ii) holds. Denote by \(\varvec{x}=(x_1,\ldots ,x_n)^{\top }\) an n-dimensional vector, and define \(\Vert \varvec{x}\Vert _1=|x_1|+|x_2|+\cdots +|x_n|\), \(\Vert \varvec{x}\Vert _{\infty }=\max _{1 \le i \le n}|x_i|\). Let
then for any \(i,j,k \in \{1,2,\ldots ,m\}\), we can verify that (B.1) holds. Moreover, \(E\Vert \varvec{Z}_t\Vert ^3 < \infty \) and (B.4) imply that (B.2) holds, so \(\hat{\varvec{\theta }}_{CLS}\) is a strongly consistent estimator. Since \(X_t-g\) is bounded by 2N, \(U_t(\varvec{\theta })\) is bounded by \(4N^2\). Together with condition (ii), we have that
Therefore, the regularity conditions for Theorem 3.2 in Klimko and Nelson (1978) also hold, implying the asymptotic normality of \(\hat{\varvec{\theta }}_{CLS}\). \(\square \)
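For the special case \(p=1\) with an intercept-only covariate (so that the two thinning probabilities, call them a and b, are free constants), the CLS criterion reduces to an ordinary least-squares problem, because the conditional mean \(aX_{t-1}+b(N-X_{t-1})\) is linear in \((a,b)\). A minimal numerical sketch, with hypothetical names and not part of the paper:

```python
import numpy as np

def cls_binar1(X, N):
    """CLS for the intercept-only first-order special case (sketch).

    E[X_t | X_{t-1}] = a*X_{t-1} + b*(N - X_{t-1}) is linear in (a, b),
    so minimizing the CLS criterion sum_t (X_t - g)^2 is an ordinary
    least-squares problem with design columns X_{t-1} and N - X_{t-1}.
    """
    Xlag = np.asarray(X[:-1], dtype=float)
    y = np.asarray(X[1:], dtype=float)
    D = np.column_stack([Xlag, N - Xlag])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return coef  # (a_hat, b_hat); not constrained to (0, 1) in this sketch
```

Note that the unconstrained OLS solution is not forced into \((0,1)\); the logistic parameterization in the paper handles that automatically.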
Proof of Theorem 3.2
To prove Theorem 3.2, we first give an equivalent representation of the RCMBAR(p)-X process. We begin with some notations. Let \(\varvec{Y}_t:=(X_t,X_{t-1},\ldots ,X_{t-p+1})^{\top }\), \(\alpha _{i,t}=D_{t,i}\alpha _t\), \(\beta _{i,t}=D_{t,i}\beta _t\), \(i=1,2,\ldots ,p\), and further denote two pth-order matrices \(\varvec{\alpha }_t\) and \(\varvec{\beta }_t\) as
With the fact that \(0\circ X=0\) and \(1\circ X=X\) (see Lemma 1 in Silva and Oliveira 2004), model (2.2) can be written in the following form:
where \(\varvec{N}^{\top } = (N,\ldots ,N)_{1\times p}\), and the “\(\circ \)” operation here denotes a matrix operation that acts as the usual matrix multiplication with scalar multiplication replaced by the binomial thinning operation. Thus, we obtain a multivariate version of the binomial autoregressive model with state space
i.e., \(\mathbb {S}=\{(s_1,\ldots ,s_p)| s_j \in \{0, 1, \ldots , N\}, j=1,2, \ldots ,p\}\). Denote by \(P_{t|t-1}(\varvec{\theta }):=P(\varvec{Y}_t=\varvec{y}_t|\varvec{Y}_{t-1}=\varvec{y}_{t-1})\) the transition probability of \(\{\varvec{Y}_t\}\). Thus, we have
Equation (B.6) implies that models (2.2) and (B.5) have the same transition probabilities, and accordingly the same CML estimators. Therefore, we can use the result in Billingsley (1961) to prove Theorem 3.2. To this end, we need to verify Condition 5.1 in Billingsley (1961), which is fulfilled provided that:
1. The set D of \((\varvec{k},\varvec{l})\) such that \(P_{t|t-1}(\varvec{\theta })=P(\varvec{Y}_t=\varvec{k}|\varvec{Y}_{t-1}=\varvec{l},\varvec{Z}_t)>0\) is independent of \(\varvec{\theta }\);
2. Each \(P_{t|t-1}(\varvec{\theta })\) has continuous partial derivatives of third order throughout \(\Theta \);
3. The \(d\times r\) matrix
(d being the number of elements in D) has rank r throughout \(\Theta \), \(r:=\dim (\Theta )\).
4. For each \(\varvec{\theta }\in \Theta \) there is only one ergodic set and there are no transient states.
Conditions 1 and 2 are easily verified by (2.6). For any \(\varvec{\theta }\), we can select an r-dimensional square matrix of rank r from the \(d\times r\) matrix (B.7), so Condition 3 also holds. Since the state space of (B.5) is a finite set and \(P_{t|t-1}(\varvec{\theta })>0\), Condition 4 holds. Thus, Conditions 1 to 4 are all fulfilled, which implies that Condition 5.1 in Billingsley (1961) holds. Thereby, the CML estimators \(\hat{\varvec{\theta }}_{CML}\) are strongly consistent and asymptotically normal.
The proof is complete. \(\square \)
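For the first-order special case with the thinning probabilities a and b held fixed (given the covariates), the one-step transition probability entering the CML likelihood is the convolution of \(\mathrm{Bin}(l,a)\) and \(\mathrm{Bin}(N-l,b)\) distributions. A minimal sketch with a hypothetical function name:

```python
from math import comb

def trans_prob(k, l, N, a, b):
    """P(X_t = k | X_{t-1} = l) for the first-order special case (sketch).

    X_t is the sum of two independent binomials: Bin(l, a) survivors of
    X_{t-1} = l, plus Bin(N - l, b) recruits from the complement N - l.
    """
    return sum(
        comb(l, m) * a**m * (1 - a)**(l - m)
        * comb(N - l, k - m) * b**(k - m) * (1 - b)**(N - l - (k - m))
        for m in range(max(0, k - (N - l)), min(l, k) + 1)
    )
```

Since this is a genuine probability mass function on \(\{0,1,\ldots,N\}\), the probabilities sum to one for any admissible \((l, a, b)\), which is a convenient numerical check.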
Appendix C: The autocovariance function
Based on model (B.5), we derive the autocovariance function in the following form. Denote by \(\varvec{\gamma }(k):=\textrm{Cov}(\varvec{Y}_t,\varvec{Y}_{t-k})\). By the law of total covariance, we have
The above representation characterizes the autocorrelation structure of (B.5), which in turn characterizes that of the RCMBAR(p)-X process.
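The model-based autocovariance \(\varvec{\gamma }(k)\) can be checked against its empirical counterpart on simulated or observed series; a minimal sample-autocovariance sketch (hypothetical name):

```python
import numpy as np

def sample_autocov(x, k):
    """Sample autocovariance at lag k: mean of centered cross-products.

    Uses the biased (divide-by-n-after-truncation) convention; other
    conventions divide by the full series length n instead.
    """
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    return float(np.mean((x[k:] - xbar) * (x[:len(x) - k] - xbar)))
```

For instance, an alternating 0/1 series has sample variance 0.25 and a negative lag-1 autocovariance, as expected for a strongly anti-persistent sequence.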
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, H., Liu, Z., Yang, K. et al. A pth-order random coefficients mixed binomial autoregressive process with explanatory variables. Comput Stat (2023). https://doi.org/10.1007/s00180-023-01396-8