
A pth-order random coefficients mixed binomial autoregressive process with explanatory variables

Original Paper · Computational Statistics

Abstract

To capture the higher-order autocorrelation structure of finite-range integer-valued time series of counts, and to account for the driving effect of covariates on the underlying process, this paper introduces a pth-order random coefficients mixed binomial autoregressive process with explanatory variables. The basic probabilistic and statistical properties of the model are discussed. Conditional least squares and conditional maximum likelihood estimators are obtained, together with their asymptotic properties. Moreover, a Wald-type test is developed to assess the significance of the explanatory variables. The forecasting problem is also considered. Finally, numerical results for the estimators and a real-data example are presented to illustrate the performance of the proposed model.



Acknowledgements

We gratefully acknowledge the associate editor and the anonymous referees for their valuable time and helpful comments, which have substantially improved this article.

Funding

This work is supported by the National Natural Science Foundation of China (No. 11901053), the Natural Science Foundation of Jilin Province (Nos. YDZJ202301ZYTS393, 20230201078GX, 20220101038JC, 20210101149JC), the Postdoctoral Foundation of Jilin Province (No. 2023337), and the Scientific Research Project of Jilin Provincial Department of Education (Nos. JJKH20220671KJ, JJKH20230665KJ).

Author information

Corresponding author

Correspondence to Kai Yang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.


Appendices

Appendix A: The derivations of moments

In the following, we derive the moments of the RCMBAR(p)-X model. With the representation (2.4), the calculation of the conditional expectation is trivial and therefore omitted. We proceed to derive the conditional variance.

Since (2.4) implies \(D_{t,i}D_{t,j}=0\) for \(i \ne j\), it follows that

$$\begin{aligned}&E(X_t^2|X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t)\\&\quad =E\left( \sum _{i=1}^p D_{t,i}^2 ( \alpha _t\circ X_{t-i}+\beta _t\circ (N-X_{t-i}) )^2|X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t \right) \\&\quad =\sum _{i=1}^p \phi _i E[(\alpha _t\circ X_{t-i})^2+2(\alpha _t\circ X_{t-i})(\beta _t\circ (N-X_{t-i}))+(\beta _t\circ (N-X_{t-i}))^2|X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t]\\&\quad =\sum _{i=1}^p \phi _i [\alpha _t(1-\alpha _t)X_{t-i}+\alpha _t^2X_{t-i}^2 + \beta _t(1-\beta _t)(N-X_{t-i}) +\beta _t^2(N-X_{t-i})^2 + 2\alpha _t\beta _tX_{t-i}(N-X_{t-i})]\\&\quad =\sum _{i=1}^p \phi _i \left( \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1))^2} X_{t-i} + \frac{\exp (2\varvec{Z}_t^{\top } \varvec{\delta }_1)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1))^2} X_{t-i}^2 +\frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2))^2} (N-X_{t-i}) \right. \\&~~~~~~~~~\left. +\frac{\exp (2\varvec{Z}_t^{\top } \varvec{\delta }_2)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2))^2} (N-X_{t-i})^2 +2\prod _{j=1}^2 \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_j)}{ 1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_j)} X_{t-i}(N-X_{t-i}) \right) . \end{aligned}$$

Therefore, the conditional variance can be derived as follows:

$$\begin{aligned}&\textrm{Var}(X_t|X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t) =E(X_t^2|X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t)-E^2(X_t|X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t)\\&\quad =\sum _{i=1}^p \phi _i \left( \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1))^2} X_{t-i} + \frac{\exp (2\varvec{Z}_t^{\top } \varvec{\delta }_1)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1))^2} X_{t-i}^2 +\frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2))^2} (N-X_{t-i}) \right. \\&\left. +\frac{\exp (2\varvec{Z}_t^{\top } \varvec{\delta }_2)}{(1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2))^2} (N-X_{t-i})^2 +2\prod _{j=1}^2 \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_j)}{ 1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_j)} X_{t-i}(N-X_{t-i}) \right) \\&\quad -\left( \sum _{i=1}^p \phi _i \left( \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}{1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}X_{t-i}+ \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}{1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}(N-X_{t-i})\right) \right) ^2. \end{aligned}$$
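As a numerical sanity check, the closed-form conditional moments can be compared with the moments of the exact one-step distribution of the mixture: with probability \(\phi_i\) the lag-\(i\) regime applies and \(X_t\) is a sum of two independent binomial thinnings (note the \((N-X_{t-i})^2\) term is driven by \(\beta_t^2\), i.e. by \(\varvec{\delta}_2\)). The sketch below is illustrative only: the helper names and all parameter values are hypothetical, and `alpha`, `beta` stand for the logistic-link probabilities at a fixed \(\varvec{Z}_t\).

```python
import numpy as np
from math import comb

def binom_pmf(n, q):
    """pmf vector of Binomial(n, q) on {0, ..., n}."""
    return np.array([comb(n, m) * q**m * (1 - q)**(n - m) for m in range(n + 1)])

def one_step_pmf(x_past, phi, alpha, beta, N):
    """Exact one-step pmf of X_t: with probability phi[i] the lag-i regime
    is chosen and X_t = Bin(x_{t-i}, alpha) + Bin(N - x_{t-i}, beta)."""
    pmf = np.zeros(N + 1)
    for ph, x in zip(phi, x_past):
        pmf += ph * np.convolve(binom_pmf(x, alpha), binom_pmf(N - x, beta))
    return pmf

def cond_mean_var(x_past, phi, alpha, beta, N):
    """Closed-form conditional mean and variance as derived in Appendix A."""
    m = sum(ph * (alpha * x + beta * (N - x)) for ph, x in zip(phi, x_past))
    m2 = sum(ph * (alpha * (1 - alpha) * x + alpha**2 * x**2
                   + beta * (1 - beta) * (N - x) + beta**2 * (N - x)**2
                   + 2 * alpha * beta * x * (N - x)) for ph, x in zip(phi, x_past))
    return m, m2 - m**2

# hypothetical settings: p = 2 lags, range N = 10, fixed thinning probabilities
N, phi, x_past = 10, [0.6, 0.4], [3, 7]
alpha, beta = 0.7, 0.2
pmf = one_step_pmf(x_past, phi, alpha, beta, N)
k = np.arange(N + 1)
m, v = cond_mean_var(x_past, phi, alpha, beta, N)
# pmf-based mean/variance agree with the closed forms up to floating point
print(abs(pmf @ k - m), abs(pmf @ k**2 - (pmf @ k)**2 - v))
```

The agreement holds exactly (up to rounding) because the mixture's second moment is \(\sum_i \phi_i(\textrm{Var}_i + \textrm{mean}_i^2)\), which is precisely the expansion above.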

Appendix B: The proofs of theorems

Proof of Proposition 2.1

We first prove that the RCMBAR(p)-X process defined by (2.2) is an irreducible and aperiodic Markov chain. Without loss of generality, denote by \((\Omega _j,\mathscr {A}_j,P_j)\) the probability space of \(Z_{j,t}\). It follows by Definition 1 that \(E|Z_{j,t}|=\int _{\Omega _j}|Z_{j,t}|\textrm{d}P_j < \infty \), which implies

$$\begin{aligned} I_{\varvec{Z}}(i,m,t):= \int _{\Omega _1}\ldots \int _{\Omega _q} \frac{\exp (m \varvec{Z}_t^{\top }\varvec{\delta }_1)}{(1+\exp (\varvec{Z}_t^{\top }\varvec{\delta }_1))^{x_{t-i}}} \frac{\exp ((x_t-m)\varvec{Z}_t^{\top }\varvec{\delta }_2)}{(1+\exp (\varvec{Z}_t^{\top }\varvec{\delta }_2))^{N-x_{t-i}}} \textrm{d} {Z}_{1,t} \ldots \textrm{d} Z_{q,t} <\infty . \end{aligned}$$

Since each term in (2.5) is strictly greater than zero, we obtain

$$\begin{aligned}&P(X_{t}=x_t|X_{t-1}=x_{t-1},\ldots ,X_{t-p}=x_{t-p})\nonumber \\&\quad =\int _{\Omega _1}\ldots \int _{\Omega _q} P(X_{t}=x_t|X_{t-1}=x_{t-1},\ldots ,X_{t-p}=x_{t-p},\varvec{Z}_t)\textrm{d} {Z}_{1,t} \ldots \textrm{d} Z_{q,t} \nonumber \\&\quad = \sum _{i=1}^p\phi _i\sum _{m=a}^{b} \left( {\begin{array}{c}x_{t-i}\\ m\end{array}}\right) \left( {\begin{array}{c}N-x_{t-i}\\ x_t-m\end{array}}\right) I_{\varvec{Z}}{(i,m,t)} >0. \end{aligned}$$
(A.1)

Equation (A.1) implies that the process (2.2) is an irreducible and aperiodic chain. Since the state space \(\mathbb {S}:=\left\{ 0, 1, \ldots , N\right\} \) has only a finite number of elements, \(\{X_t\}_{t\in \mathbb {Z}}\) is also a positive recurrent Markov chain and hence ergodic. Finally, Theorem 1.3 in Karlin and Taylor (1975) guarantees the existence of the stationary distribution for \(\{X_t \}\). \(\square \)
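Since the chain lives on the finite state space \(\{0,1,\ldots,N\}\), its behaviour is easy to visualise by simulation. The generator below is a minimal sketch of the recursion (2.2): at each step the mixing indicator \(D_t\) selects one lag with probabilities \(\phi_i\), and the thinning probabilities \(\alpha_t,\beta_t\) are logistic functions of \(\varvec{Z}_t\). The function name `simulate_rcmbar` and all numerical settings are hypothetical.

```python
import numpy as np

def simulate_rcmbar(T, N, phi, delta1, delta2, Z, seed=0):
    """Simulate a path of the RCMBAR(p)-X process: pick a lag via the mixing
    indicator D_t, then apply two binomial thinnings with logistic-link
    probabilities alpha_t, beta_t driven by the covariates Z_t."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    X = np.zeros(T + p, dtype=int)
    X[:p] = rng.integers(0, N + 1, size=p)            # arbitrary initial states
    for t in range(p, T + p):
        i = rng.choice(p, p=phi)                      # mixing indicator D_t
        a = 1.0 / (1.0 + np.exp(-Z[t - p] @ delta1))  # alpha_t
        b = 1.0 / (1.0 + np.exp(-Z[t - p] @ delta2))  # beta_t
        x_lag = X[t - 1 - i]
        X[t] = rng.binomial(x_lag, a) + rng.binomial(N - x_lag, b)
    return X[p:]

# hypothetical settings: q = 2 covariates, p = 2 lags, range N = 15
T, N = 500, 15
Z = np.random.default_rng(1).normal(size=(T, 2))
X = simulate_rcmbar(T, N, phi=[0.6, 0.4],
                    delta1=np.array([0.5, -0.3]),
                    delta2=np.array([-1.0, 0.2]), Z=Z)
```

Every simulated value stays in \(\{0,\ldots,N\}\), consistent with the finite state space used in the proof.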

Proof of Theorem 3.1

To prove Theorem 3.1, we need to verify that all the regularity conditions of Theorems 3.1 and 3.2 in Klimko and Nelson (1978) hold. The regularity conditions for Theorem 3.1 in Klimko and Nelson (1978) are given as follows:

(i) \(\partial g/\partial \theta _i\), \(\partial ^2 g/\partial \theta _i\partial \theta _j\), \(\partial ^3 g/\partial \theta _i\partial \theta _j\partial \theta _k\) exist and are continuous for all \(\varvec{\theta }\in \Theta \), where \(\theta _i\), \(\theta _j\), \(\theta _k\) denote the components of \(\varvec{\theta }\), \(i,j,k\in \{1,2, \ldots ,m\}\), g is the abbreviation for \(g(\varvec{\theta },X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t)\), and \(m=2q+p+1\) denotes the dimension of \(\varvec{\theta }\);

(ii) For \(i,j \in \{1,2,\ldots ,m\}\), \(E|(X_1-g)\partial g/\partial \theta _i|<\infty \), \(E|(X_1-g)\partial ^2 g/\partial \theta _i\partial \theta _j|<\infty \) and \(E|\partial g/\partial \theta _i \cdot \partial g/\partial \theta _j|<\infty \), where g and its partial derivatives are evaluated at \(\varvec{\theta }_0\), conditional on the \(\sigma \)-field generated by all the information prior to time zero;

(iii) For \(i,j,k \in \{1,2,\ldots ,m\}\) there exist functions \(H^{(0)}(X_0,\ldots ,X_{1-p})\), \(H_i^{(1)}(X_0,\ldots ,X_{1-p})\), \(H_{ij}^{(2)}(X_0,\ldots ,X_{1-p})\), \(H_{ijk}^{(3)}(X_0,\ldots ,X_{1-p})\) such that

$$\begin{aligned} |g|<H^{(0)},~|\partial g/\partial \theta _i|<H_i^{(1)},~ |\partial ^2 g/\partial \theta _i\partial \theta _j|<H_{ij}^{(2)},~ |\partial ^3 g/\partial \theta _i\partial \theta _j\partial \theta _k|<H_{ijk}^{(3)}, \end{aligned}$$
(B.1)

for all \(\varvec{\theta }\in \Theta \), and

$$\begin{aligned} \begin{array}{l} E|X_1\cdot H_{ijk}^{(3)}(X_0,\ldots ,X_{1-p})|<\infty ,\\ E(H^{(0)}(X_0,\ldots ,X_{1-p})\cdot H_{ijk}^{(3)}(X_0,\ldots ,X_{1-p}))<\infty ,\\ E(H_i^{(1)}(X_0,\ldots ,X_{1-p})\cdot H_{ij}^{(2)}(X_0,\ldots ,X_{1-p}))<\infty . \end{array} \end{aligned}$$
(B.2)

Recall that \(g(\varvec{\theta },X_{t-1},\ldots ,X_{t-p},\varvec{Z}_t)=\sum _{i=1}^p \phi _i \left( \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}{1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_1)}X_{t-i}+ \frac{\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}{1+\exp (\varvec{Z}_t^{\top } \varvec{\delta }_2)}(N-X_{t-i})\right) \). It is easy to check that condition (i) holds. Let \(p_k:=P(X_t=k)\), \(k=0,1,\ldots ,N\). Thus, for any fixed \(s \ge 1\), we have

$$\begin{aligned} E(|X_t|^s)=E(X_t^s)=\sum _{k=0}^N p_k \cdot k^s \le \sum _{k=0}^N p_k \cdot N^s =N^s <\infty . \end{aligned}$$
(B.3)

By Definition 1, \(\varvec{Z}_t\) has a finite covariance matrix, which, together with (B.3), ensures that condition (ii) holds. For an n-dimensional vector \(\varvec{x}=(x_1,\ldots ,x_n)^{\top }\), denote \(\Vert \varvec{x}\Vert _1=|x_1|+|x_2|+\cdots +|x_n|\) and \(\Vert \varvec{x}\Vert _{\infty }=\max _{1 \le i \le n}|x_i|\). Let

$$\begin{aligned} \begin{array}{ll} H^{(0)}(X_0,\ldots ,X_{1-p})=N, &{} H_i^{(1)}(X_0,\ldots ,X_{1-p})=N \Vert \varvec{Z}_t\Vert _1,\\ H_{ij}^{(2)}(X_0,\ldots ,X_{1-p})=N \Vert \varvec{Z}_t\Vert _{\infty }^2, &{} H_{ijk}^{(3)}(X_0,\ldots ,X_{1-p})=N \Vert \varvec{Z}_t\Vert _{\infty }^3, \end{array} \end{aligned}$$
(B.4)

then for any \(i,j,k \in \{1,2,\ldots ,m\}\), we can verify that (B.1) holds. Moreover, \(E\Vert \varvec{Z}_t\Vert ^3 < \infty \) and (B.4) imply that (B.2) holds, so \(\hat{\varvec{\theta }}_{CLS}\) is a strongly consistent estimator. Since \(X_t-g\) is bounded by 2N, \(U_t(\varvec{\theta })\) is bounded by \(4N^2\). Together with condition (ii), we have

$$\begin{aligned} E(U_1(\varvec{\theta })|\partial g/\partial \theta _i \cdot \partial g/\partial \theta _j|)<\infty ,~i,j \in \{1,2,\ldots ,m\}. \end{aligned}$$

Therefore, the regularity conditions for Theorem 3.2 in Klimko and Nelson (1978) also hold, implying the asymptotic normality of \(\hat{\varvec{\theta }}_{CLS}\). \(\square \)
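For concreteness, the CLS criterion behind \(\hat{\varvec{\theta}}_{CLS}\) is \(Q(\varvec{\theta})=\sum_t\big(X_t-g(\varvec{\theta},X_{t-1},\ldots,X_{t-p},\varvec{Z}_t)\big)^2\). The sketch below evaluates this objective; the packing of \(\varvec{\theta}\), the data, and all names are hypothetical, and in practice \(Q\) would be handed to a numerical optimiser rather than merely evaluated.

```python
import numpy as np

N, p, q = 15, 2, 2  # hypothetical range, order, and number of covariates

def cls_objective(theta, X, Z):
    """CLS criterion Q(theta) = sum_t (X_t - g_t(theta))^2, where g_t is the
    conditional mean of the RCMBAR(p)-X model.
    theta packs (phi_1..phi_p, delta_1 (q entries), delta_2 (q entries))."""
    phi = theta[:p]
    d1, d2 = theta[p:p + q], theta[p + q:p + 2 * q]
    a = 1.0 / (1.0 + np.exp(-Z @ d1))   # alpha_t for each t
    b = 1.0 / (1.0 + np.exp(-Z @ d2))   # beta_t for each t
    Q = 0.0
    for t in range(p, len(X)):
        g = sum(phi[i] * (a[t] * X[t - 1 - i] + b[t] * (N - X[t - 1 - i]))
                for i in range(p))
        Q += (X[t] - g) ** 2
    return Q

# hypothetical data: any {0,...,N}-valued series with covariates will do here
rng = np.random.default_rng(2)
X = rng.integers(0, N + 1, size=200)
Z = rng.normal(size=(200, q))
theta0 = np.array([0.6, 0.4, 0.5, -0.3, -1.0, 0.2])
val = cls_objective(theta0, X, Z)   # finite, nonnegative sum of squares
```

The boundedness used in the proof is visible here: each residual \(X_t-g\) lies in \([-N,N]\), so every summand is at most \(N^2\).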

Proof of Theorem 3.2

To prove Theorem 3.2, we first give an equivalent representation of the RCMBAR(p)-X process. We begin with some notation. Let \(\varvec{Y}_t:=(X_t,X_{t-1},\ldots ,X_{t-p+1})^{\top }\), \(\alpha _{i,t}=D_{t,i}\alpha _t\), \(\beta _{i,t}=D_{t,i}\beta _t\), \(i=1,2,\ldots ,p\), and further define the two \(p\times p\) matrices \(\varvec{\alpha }_t\) and \(\varvec{\beta }_t\) as

$$\begin{aligned} \varvec{\alpha }_t:= \left( \begin{array}{ccccc} \alpha _{1,t} &{} \alpha _{2,t} &{} \ldots &{} \alpha _{p-1,t} &{} \alpha _{p,t}\\ 1 &{} 0 &{} \ldots &{} 0 &{} 0 \\ 0 &{} 1 &{} \ldots &{} 0 &{} 0\\ \vdots &{} \vdots &{} &{} \vdots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} 1 &{} 0\\ \end{array}\right)\quad \text{ and }\quad \varvec{\beta }_t:= \left( \begin{array}{ccccc} \beta _{1,t} &{} \beta _{2,t} &{} \ldots &{} \beta _{p-1,t} &{} \beta _{p,t}\\ 0 &{} 0 &{} \ldots &{} 0 &{} 0 \\ 0 &{} 0 &{} \ldots &{} 0 &{} 0\\ \vdots &{} \vdots &{} &{} \vdots &{} \vdots \\ 0 &{} 0 &{} \ldots &{} 0 &{} 0\\ \end{array}\right) . \end{aligned}$$

With the fact that \(0\circ X=0\) and \(1\circ X=X\) (see Lemma 1 in Silva and Oliveira 2004), model (2.2) can be written in the following form:

$$\begin{aligned} \varvec{Y}_t=\varvec{\alpha }_t \circ \varvec{Y}_{t-1} + \varvec{\beta }_t \circ (\varvec{N}-\varvec{Y}_{t-1}), \end{aligned}$$
(B.5)

where \(\varvec{N}^{\top } = (N,\ldots ,N)_{1\times p}\), and the “\(\circ \)” operation here denotes a matrix operation that acts as the usual matrix multiplication with scalar multiplication replaced by the binomial thinning operation. Thus, we obtain a multivariate binomial autoregressive model with state space

$$\begin{aligned} \mathbb {S}:=\underbrace{\{0, 1, \ldots , N \} \times \{0, 1, \ldots , N \} \times \ldots \times \{0, 1, \ldots , N \}}_{p\text {-fold Cartesian product}}, \end{aligned}$$

i.e., \(\mathbb {S}=\{(s_1,\ldots ,s_p)| s_j \in \{0, 1, \ldots , N\}, j=1,2, \ldots ,p\}\). Denote by \(P_{t|t-1}(\varvec{\theta }):=P(\varvec{Y}_t=\varvec{y}_t|\varvec{Y}_{t-1}=\varvec{y}_{t-1},\varvec{Z}_t)\) the transition probability of \(\{\varvec{Y}_t\}\). Thus, we have

$$\begin{aligned} P_{t|t-1}(\varvec{\theta })&=P(\varvec{Y}_t=\varvec{y}_t|\varvec{Y}_{t-1}=\varvec{y}_{t-1},\varvec{Z}_t)\nonumber \\&=P(X_t=x_t,\ldots ,X_{t-p+1}=x_{t-p+1}|X_{t-1}=x_{t-1},\ldots ,X_{t-p}=x_{t-p},\varvec{Z}_t)\nonumber \\&=P(X_t=x_t|X_{t-1}=x_{t-1},\ldots ,X_{t-p}=x_{t-p},\varvec{Z}_t). \end{aligned}$$
(B.6)

Equation (B.6) implies that models (2.2) and (B.5) have the same transition probabilities, and hence the same CML estimators. Therefore, we can use the result in Billingsley (1961) to prove Theorem 3.2. To this end, we need to verify that Condition 5.1 in Billingsley (1961) holds, which is fulfilled provided that:

1. The set D of \((\varvec{k},\varvec{l})\) such that \(P_{t|t-1}(\varvec{\theta })=P(\varvec{Y}_t=\varvec{k}|\varvec{Y}_{t-1}=\varvec{l},\varvec{Z}_t)>0\) is independent of \(\varvec{\theta }\);

2. Each \(P_{t|t-1}(\varvec{\theta })\) has continuous partial derivatives of third order throughout \(\Theta \);

3. The \(d\times r\) matrix

$$\begin{aligned} (\partial P_{t|t-1}(\varvec{\theta })/\partial \theta _{u}),~u=1,\ldots ,r, \end{aligned}$$
(B.7)

(d being the number of elements in D) has rank r throughout \(\Theta \), \(r:=\dim (\Theta )\).

4. For each \(\varvec{\theta }\in \Theta \) there is only one ergodic set and there are no transient states.

Conditions 1 and 2 are easily verified via (2.6). For any \(\varvec{\theta }\), we can select an r-dimensional square matrix of rank r from the \(d\times r\) matrix (B.7), so Condition 3 also holds. Since the state space of (B.5) is a finite set and \(P_{t|t-1}(\varvec{\theta })>0\), Condition 4 holds. Thus, Conditions 1 to 4 are all fulfilled, which implies that Condition 5.1 in Billingsley (1961) holds. Thereby, the CML estimator \(\hat{\varvec{\theta }}_{CML}\) is strongly consistent and asymptotically normal.

The proof is complete. \(\square \)
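Since (2.2) and (B.5) share the same transition probabilities, the CML estimator maximises the conditional log-likelihood \(\ell(\varvec{\theta})=\sum_t \log P(X_t=x_t|X_{t-1},\ldots,X_{t-p},\varvec{Z}_t)\), with one-step probabilities given by the \(\phi\)-mixture of thinned binomials. The sketch below evaluates \(\ell\) via the exact mixture pmf; all names, data, and parameter values are hypothetical.

```python
import numpy as np
from math import comb

def binom_pmf(n, q):
    """pmf vector of Binomial(n, q) on {0, ..., n}."""
    return np.array([comb(n, m) * q**m * (1 - q)**(n - m) for m in range(n + 1)])

def cond_loglik(phi, d1, d2, X, Z, N):
    """Conditional log-likelihood of the RCMBAR(p)-X model: for each t, the
    one-step pmf is the phi-mixture of Bin(X_{t-i}, alpha_t) + Bin(N - X_{t-i}, beta_t)."""
    p = len(phi)
    ll = 0.0
    for t in range(p, len(X)):
        a = 1.0 / (1.0 + np.exp(-Z[t] @ d1))   # alpha_t
        b = 1.0 / (1.0 + np.exp(-Z[t] @ d2))   # beta_t
        pmf = np.zeros(N + 1)
        for i in range(p):
            x = X[t - 1 - i]
            pmf += phi[i] * np.convolve(binom_pmf(x, a), binom_pmf(N - x, b))
        ll += np.log(pmf[X[t]])                # transition prob is strictly > 0
    return ll

# hypothetical data and parameter values
rng = np.random.default_rng(3)
N = 10
X = rng.integers(0, N + 1, size=100)
Z = rng.normal(size=(100, 2))
ll = cond_loglik([0.6, 0.4], np.array([0.5, -0.3]), np.array([-1.0, 0.2]), X, Z, N)
```

Because every transition probability is strictly positive (as in (A.1)), the log-likelihood is always finite, which is what makes its maximisation well posed.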

Appendix C: The autocovariance function

Based on model (B.5), we derive the autocovariance function as follows. Denote by \(\varvec{\gamma }(k):=\textrm{Cov}(\varvec{Y}_t,\varvec{Y}_{t-k})\). By the law of total covariance, we have

$$\begin{aligned} \varvec{\gamma }(k)&=\textrm{Cov}(E(\varvec{Y}_t|\varvec{Y}_{t-1},\ldots ),E(\varvec{Y}_{t-k}|\varvec{Y}_{t-1},\ldots ))+0\\&\quad =\textrm{Cov}(E(\varvec{\alpha }_t)\varvec{Y}_{t-1}+E(\varvec{\beta }_t)(\varvec{N}-\varvec{Y}_{t-1}),\varvec{Y}_{t-k})\\&\quad =E(\varvec{\alpha }_t)\textrm{Cov}(\varvec{Y}_{t-1},\varvec{Y}_{t-k})-E(\varvec{\beta }_t)\textrm{Cov}(\varvec{Y}_{t-1},\varvec{Y}_{t-k})\\&\quad =(E(\varvec{\alpha }_t)-E(\varvec{\beta }_t))\varvec{\gamma }(k-1)\\&\quad =\ldots \\&\quad =(E(\varvec{\alpha }_t)-E(\varvec{\beta }_t))^k \varvec{\gamma }(0). \end{aligned}$$

The above representation characterizes the autocorrelation structure of (B.5), and hence that of the RCMBAR(p)-X process.
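The recursion \(\varvec{\gamma}(k)=(E(\varvec{\alpha}_t)-E(\varvec{\beta}_t))\varvec{\gamma}(k-1)\) can be unfolded numerically. The sketch below builds the two companion matrices of Appendix B for constant (hypothetical) thinning probabilities \(\alpha\), \(\beta\) and checks that ten steps of the recursion agree with the tenth matrix power; \(\varvec{\gamma}(0)\) is an arbitrary illustrative matrix, not an output of the model.

```python
import numpy as np

p, alpha, beta = 2, 0.7, 0.2          # hypothetical order and probabilities
phi = np.array([0.6, 0.4])

# E(alpha_t): first row phi_i * alpha, ones on the subdiagonal (companion form)
A = np.zeros((p, p)); A[0] = phi * alpha; A[1:, :-1] = np.eye(p - 1)
# E(beta_t): first row phi_i * beta, zeros elsewhere
B = np.zeros((p, p)); B[0] = phi * beta

M = A - B                              # gamma(k) = M @ gamma(k-1)
gamma0 = np.array([[1.0, 0.5], [0.5, 1.0]])  # illustrative gamma(0)

g = gamma0.copy()
for _ in range(10):                    # unfold the recursion ten times
    g = M @ g
# agrees with the closed form gamma(10) = M^10 @ gamma(0)
same = np.allclose(g, np.linalg.matrix_power(M, 10) @ gamma0)
```

For these values the spectral radius of \(M\) is below one, so \(\varvec{\gamma}(k)\) shrinks geometrically, matching the geometric decay implied by the closed form.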

Cite this article

Li, H., Liu, Z., Yang, K. et al. A pth-order random coefficients mixed binomial autoregressive process with explanatory variables. Comput Stat (2023). https://doi.org/10.1007/s00180-023-01396-8
