Variable selection in generalized random coefficient autoregressive models
Abstract
In this paper, we consider the variable selection problem of the generalized random coefficient autoregressive model (GRCA). Instead of parametric likelihood, we use non-parametric empirical likelihood in the information theoretic approach. We propose an empirical likelihood-based Akaike information criterion (AIC) and a Bayesian information criterion (BIC).
Keywords
Empirical likelihood; Akaike information criterion; Bayesian information criterion; Generalized random coefficient autoregressive model; Variable selection
MSC
62M10; 91B62
1 Introduction
As a generalization of the usual autoregressive model, the random coefficient autoregressive (RCAR) model (cf. [1, 2]), the Markovian bilinear model and its generalization, and the random coefficient exponential autoregressive model (cf. [3, 4, 5]), the generalized random coefficient autoregressive (GRCA) model (1) was first introduced by Hwang and Basawa [6]. GRCA has become one of the important models in the nonlinear time series context. In recent years, GRCA has been studied by many authors. For instance, Hwang and Basawa [7] established the local asymptotic normality of a class of generalized random coefficient autoregressive processes. Carrasco and Chen [8] provided tractable sufficient conditions that simultaneously imply strict stationarity, finiteness of higher-order moments, and β-mixing with geometric decay rates. Zhao and Wang [9] constructed confidence regions for the parameters of model (1) by using an empirical likelihood method. Furthermore, Zhao et al. [10] considered the problem of testing the constancy of the coefficients in the stationary first-order generalized random coefficient autoregressive model. In this paper, we consider the variable selection problem of the GRCA based on the empirical likelihood method.
Many model selection procedures have been proposed in the statistical literature, including the adjusted \(R^{2}\) (see Theil [11]), the AIC (see Akaike [12]), the BIC (see Schwarz [13]), and Mallows's \(C_{p}\) (see Mallows [14]). Other criteria in the literature include Hannan and Quinn’s criterion [15], Geweke and Meese’s criterion [16], Cavanaugh’s Kullback information criterion [17], and the deviance information criterion of Spiegelhalter et al. [18]. Also, Tsay [19], Hurvich and Tsai [20] and Pötscher [21] have studied model selection methods in time series models. Recently, the model selection problem has been extended to moment selection as in Andrews [22], Andrews and Lu [23] and Hong et al. [24]. These model selection methods are concerned with parsimony, as was stressed in Zellner et al. [25], as well as accuracy or power in choosing models.
In this paper, we develop an information theoretic approach to the variable selection problem of the GRCA. Specifically, instead of parametric likelihood, we use non-parametric empirical likelihood (see Owen [26, 27]) in the information theoretic approach. We propose an empirical likelihood-based Akaike information criterion (EAIC) and a Bayesian information criterion (EBIC).
The paper proceeds as follows. The next section is concerned with the methodology and the main results. Section 3 is devoted to the proofs of the main results.
Throughout the paper, we use the symbols “\(\stackrel{d}{\longrightarrow}\)” and “\(\stackrel{p}{\longrightarrow }\)” to denote convergence in distribution and convergence in probability, respectively. We abbreviate “almost surely” and “independent and identically distributed” to “a.s.” and “i.i.d.”, respectively. \(o_{p}(1)\) denotes a term that converges to zero in probability, and \(O_{p}(1)\) denotes a term that is bounded in probability. Furthermore, the Kronecker product of the matrices A and B is denoted by \(A \otimes B\), and \(\Vert M \Vert \) denotes the \(L_{2}\) norm of a vector or matrix M.
2 Methods and main results
In this section, we will first propose the empirical likelihood-based information criteria for choice of a GRCA, then we investigate the asymptotic properties of the new variable selection method.
2.1 Empirical likelihood-based information criteria
The definition of \(\tilde{l}(\phi )\) relies on finding positive \(p_{t}\)'s such that \(\sum_{t=1}^{n}p_{t}G_{t}(\phi )=0\) for each ϕ. The solution exists if and only if the convex hull of the \(G_{t}(\phi )\), \(t=1, 2, \ldots , n\), contains zero as an inner point. When the model is correct, the solution exists with probability tending to 1 as the sample size \(n\rightarrow \infty \) for ϕ in a neighborhood of \(\phi_{0}\). However, for finite n and at some values of ϕ, the equation often does not have a solution in \(p_{t}\). To avoid this problem, we introduce the adjusted empirical likelihood.
Since 0 always lies on the line connecting \(\bar{G}_{n}\) and \({G}_{n+1}\), the adjusted empirical log-likelihood ratio function is well defined after adding a pseudo-value \({G}_{n+1}\) to the data set. The adjustment is particularly useful so that a numerical program does not crash simply because some undesirable ϕ is assessed.
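To make the adjustment concrete, the following Python sketch computes an adjusted empirical log-likelihood ratio for scalar values of the estimating function. It assumes that the pseudo-value has the form \(G_{n+1}=-a_{n}\bar{G}_{n}\) with \(a_{n}>0\), which agrees with the remark that 0 lies on the line connecting \(\bar{G}_{n}\) and \(G_{n+1}\) but is not displayed in this text. The default \(a_{n}=\max (1, \log (n)/2)\), the function name and the bisection solver are illustrative assumptions, not the authors' implementation.

```python
import math

def adjusted_el_ratio(g, a_n=None):
    """Adjusted empirical log-likelihood ratio for scalar values g_1..g_n.

    Assumptions (not from the source text): the pseudo-value is
    g_{n+1} = -a_n * mean(g), and a_n defaults to max(1, log(n) / 2).
    """
    n = len(g)
    g_bar = sum(g) / n
    if g_bar == 0.0:
        return 0.0  # equal weights already satisfy the constraint
    if a_n is None:
        a_n = max(1.0, math.log(n) / 2.0)
    values = list(g) + [-a_n * g_bar]  # append the pseudo-value

    # The multiplier lam solves sum(v / (1 + lam * v)) = 0 subject to
    # 1 + lam * v > 0 for every v. The pseudo-value guarantees that the
    # values have both signs, so the admissible interval is bounded and
    # the (decreasing) score has exactly one root inside it.
    lo = -1.0 / max(values)
    hi = -1.0 / min(values)
    lam = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval can no longer be halved in floating point
        lam = mid
        score = sum(v / (1.0 + lam * v) for v in values)
        if score > 0.0:
            lo = lam
        else:
            hi = lam
    return 2.0 * sum(math.log(1.0 + lam * v) for v in values)

# All values are positive, so zero is outside their convex hull and the
# unadjusted problem has no solution; the adjusted ratio is still finite.
ratio = adjusted_el_ratio([1.0, 2.0, 3.0])
```

In the vector case the same idea applies with a Newton-type solver for the multiplier; bisection is used here only because the scalar case allows it.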
After \(l(s)\) is evaluated for all s, we select the model with the minimum EAIC or EBIC value.
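The selection step can be sketched as follows, assuming that the criteria take the penalized forms \(\mathrm{EAIC}(s)=l(s)+2k\) and \(\mathrm{EBIC}(s)=l(s)+k\log n\), where k is the size of the submodel s. These forms are not displayed in this text; they are an assumption that agrees with the threshold 2 used in the proof of Theorem 2.4. The values of \(l(s)\) below are made-up numbers for illustration only.

```python
import math

def select_submodel(l_values, n, criterion="EBIC"):
    """Return the subset s that minimizes the chosen criterion.

    l_values maps a tuple of variable indices s to the value l(s).
    Assumed penalties: 2k for EAIC and k * log(n) for EBIC, k = len(s).
    """
    if criterion not in ("EAIC", "EBIC"):
        raise ValueError("criterion must be 'EAIC' or 'EBIC'")

    def score(s):
        k = len(s)
        penalty = 2.0 * k if criterion == "EAIC" else k * math.log(n)
        return l_values[s] + penalty

    return min(l_values, key=score)

# Hypothetical l(s) values for p = 2 candidate variables and n = 200.
l_values = {(): 25.0, (1,): 4.1, (2,): 24.0, (1, 2): 0.0}
best_eaic = select_submodel(l_values, n=200, criterion="EAIC")
best_ebic = select_submodel(l_values, n=200, criterion="EBIC")
```

With these numbers the EAIC picks the full model \(\{1, 2\}\), while the heavier \(\log n\) penalty leads the EBIC to pick \(\{1\}\).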
2.2 Asymptotic properties
It is well known that under some mild conditions the parametric BIC is consistent for variable selection while the parametric AIC is not. Similarly, we can prove that, when p is constant, EBIC is consistent but EAIC is not.
- \(\mathbf{(A_{1})}\) All the eigenvalues of the matrix \(E(C_{t}\otimes C _{t})+(B\otimes B)\) are less than unity in modulus.
- \(\mathbf{(A_{2})}\) \(EY_{t}^{6}<\infty \).
Remark 1
As for the condition \(\mathbf{(A_{1})}\) and the sufficient condition for \(E\vert y_{t} \vert ^{2m}<\infty\) (\(m=1, 2, \ldots\)), we refer to Hwang and Basawa [6].
Theorem 2.1
Note that when a submodel s is a true model, it implies \(\phi_{0} ^{[\bar{s}]}=0\). That is, the components of \(\phi_{0}\) not in s are zero. Therefore, \(Y_{t}\) only relates to the variables in the positions specified by s. The following theorem shows that when \(\phi_{0}^{[\bar{s}]}=0\) is true, the adjusted empirical log-likelihood ratio statistic has a chi-squared limiting distribution with k fewer degrees of freedom.
Theorem 2.2
Assume that \(\mathbf{(A_{1})}\) and \(\mathbf{(A_{2})}\) hold and \(\phi_{0} ^{[\bar{s}]}=0\) for a submodel s of size k. Then when \(a_{n}=o _{p}(n^{\frac{1}{2}})\), we have \(l(s)\rightarrow \chi^{2}_{p-k}\) in distribution as \(n\rightarrow \infty \).
When the null hypothesis \(\phi_{0}^{[\bar{s}]}=0\) is not true, the likelihood ratio goes to ∞ as \(n\rightarrow \infty \). We state the following theorem in terms of the adjusted empirical likelihood; it also applies to the usual empirical likelihood.
Theorem 2.3
Assume that \(\mathbf{(A_{1})}\) and \(\mathbf{(A_{2})}\) hold and \(a_{n}=o_{p}(n ^{\frac{1}{2}})\). Then for any \(\phi \neq \phi_{0}\) such that \(E(G_{t}(\phi ))\neq 0\), \(l(s)\rightarrow \infty \) in probability as \(n\rightarrow \infty \).
The following theorem indicates that, when p is constant, EBIC is consistent but EAIC is not.
Theorem 2.4
Assume that \(\mathbf{(A_{1})}\) and \(\mathbf{(A_{2})}\) hold and that there exists a subset \(s_{0}\) of \(\{1, 2, \ldots , p\}\) such that, for any other subset s, \(E(G^{[s]}_{t}(\phi^{[s]}))=0\) for some ϕ if and only if s contains \(s_{0}\). Then EBIC is consistent and EAIC is not consistent.
3 Proofs of the main results
In order to prove Theorem 2.1, we first present several lemmas.
Lemma 3.1
Assume that \(\mathbf{(A_{1})}\) and \(\mathbf{(A_{2})}\) hold. Then A is positive definite and B has rank p.
Proof
Similarly, we can also prove that B has rank p. The proof of Lemma 3.1 is thus complete. □
Lemma 3.2
Proof
In what follows, we consider \(\Vert \frac{1}{n}\sum_{t=1}^{n}G_{t}(\phi_{0}) \Vert \).
Lemma 3.3
Proof
Lemma 3.4
Proof
Lemma 3.5
The proof is similar to the proof of Lemma 1 of Qin and Lawless [28], so we omit the details.
Proof of Theorem 2.1
Proof of Theorem 2.2
Proof of Theorem 2.3
Proof of Theorem 2.4
First, we consider EAIC. Consider the situation when \(s_{0}\) is empty. Let \(s=\{1\}\), which contains a single covariate. Based on the expansion in the proof of Theorem 2.2, we can prove that \(l(s_{0})-l(s)\rightarrow \chi^{2}_{1}\) in distribution, which implies that \(\lim_{n\rightarrow \infty }P(l(s_{0})-l(s)>2)>0\). Therefore, EAIC is not consistent.
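The limiting probability in this argument can be evaluated explicitly. The sketch below uses the identity \(P(\chi^{2}_{1}>x)=\operatorname{erfc}(\sqrt{x/2})\) at the EAIC threshold 2 and at the threshold \(\log n\). The second threshold assumes that EBIC penalizes each additional variable by \(\log n\), which is not displayed in this text; the names are illustrative.

```python
import math

def chi2_1_tail(x):
    """P(chi-squared with 1 degree of freedom exceeds x), for x >= 0."""
    return math.erfc(math.sqrt(x / 2.0))

# Limiting probability that EAIC prefers the one-variable model over the
# empty true model: about 0.157, whatever the sample size.
eaic_overfit = chi2_1_tail(2.0)

# Under the assumed log(n) penalty, the matching tail probability for
# EBIC shrinks as n grows.
ebic_overfit = [chi2_1_tail(math.log(n)) for n in (10**2, 10**4, 10**6)]
```

A fixed threshold leaves a fixed overfitting probability in the limit, whereas a threshold growing with n drives that probability to zero, which is the mechanism behind the different consistency results.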
4 Conclusions
Variable selection has always been an important problem for statisticians, and many variable selection methods have been proposed in the statistical literature. For the GRCA, however, no variable selection method has been available so far. In this paper, using empirical likelihood instead of parametric likelihood, we propose an empirical likelihood-based Akaike information criterion (EAIC) and Bayesian information criterion (EBIC) for the variable selection problem of the GRCA. Moreover, we prove that, under some mild conditions and when p is constant, the EBIC is consistent while the EAIC is not.
Notes
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 11571138, 11671054, 11301137, 11271155, 11371168, J1310022, 11501241), the National Social Science Fund of China (16BTJ020), the Science and Technology Research Program of the Education Department in Jilin Province for the 12th Five-Year Plan (440020031139), the “Thirteenth Five-Year Plan” Science and Technology Research Project of the Education Department of Jilin Province (Grant No. 2016103), and the Jilin Province Natural Science Foundation (20130101066JC, 20130522102JH, 20150520053JH).
Authors’ contributions
All the authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
References
- 1. Nicholls, D.F., Quinn, B.G.: Random Coefficient Autoregressive Models: An Introduction. Springer, New York (1982)
- 2. Tong, H.: Nonlinear Time Series. Oxford University Press, Oxford (1990)
- 3. Tong, H.: A note on a Markov bilinear stochastic process in discrete time. J. Time Ser. Anal. 2, 279–284 (1981)
- 4. Feigin, P.D., Tweedie, R.L.: Random coefficient autoregressive processes: a Markov chain analysis of stationarity and finiteness of moments. J. Time Ser. Anal. 6, 1–14 (1985)
- 5. Hwang, S.Y., Basawa, I.V.: Asymptotic optimal inference for a class of nonlinear time series models. Stoch. Process. Appl. 46, 91–113 (1993)
- 6. Hwang, S.Y., Basawa, I.V.: Parameter estimation for generalized random coefficient autoregressive processes. J. Stat. Plan. Inference 68, 323–327 (1998)
- 7. Hwang, S.Y., Basawa, I.V.: The local asymptotic normality of a class of generalized random coefficient autoregressive processes. Stat. Probab. Lett. 34, 165–170 (1997)
- 8. Carrasco, M., Chen, X.: β-Mixing and moment properties of RCA models with application to \(\operatorname{GARCH}(p, q)\). C. R. Acad. Sci., Sér. 1 Math. 331, 85–90 (2000)
- 9. Zhao, Z.W., Wang, D.H.: Statistical inference for generalized random coefficient autoregressive model. Math. Comput. Model. 56, 152–166 (2012)
- 10. Zhao, Z.W., Wang, D.H., Peng, C.X.: Coefficient constancy test in generalized random coefficient autoregressive model. Appl. Math. Comput. 219, 10283–10292 (2013)
- 11. Theil, H.: Economic Forecasts and Policy. North-Holland, Amsterdam (1961)
- 12. Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723 (1974)
- 13. Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)
- 14. Mallows, C.L.: Some comments on \(C_{p}\). Technometrics 15, 661–675 (1973)
- 15. Hannan, E.J., Quinn, B.G.: The determination of the order of an autoregression. J. R. Stat. Soc., Ser. B, Stat. Methodol. 41, 190–195 (1979)
- 16. Geweke, J., Meese, R.: Estimating regression models of finite but unknown order. Int. Econ. Rev. 16, 55–70 (1981)
- 17. Cavanaugh, J.E.: A large-sample model selection criterion based on Kullback's symmetric divergence. Stat. Probab. Lett. 42, 333–343 (1999)
- 18. Spiegelhalter, D.J., Best, N.G., Carlin, B.P., Linde, A.V.D.: Bayesian measures of model complexity and fit. J. R. Stat. Soc., Ser. B, Stat. Methodol. 64, 583–639 (2002)
- 19. Tsay, R.S.: Order selection in nonstationary autoregressive models. Ann. Stat. 12, 1151–1596 (1984)
- 20. Hurvich, C.M., Tsai, C.L.: Regression and time series model selection in small samples. Biometrika 76, 297–307 (1989)
- 21. Pötscher, B.M.: Model selection under nonstationarity: autoregressive models and stochastic linear regression models. Ann. Stat. 17, 1257–1274 (1989)
- 22. Andrews, D.W.K.: Consistent moment selection procedures for generalized method of moments estimation. Econometrica 67, 543–564 (1999)
- 23. Andrews, D.W.K., Lu, B.: Consistent model and moment selection criteria for GMM estimation with applications to dynamic panel models. J. Econom. 101, 123–164 (2001)
- 24. Hong, H., Preston, B., Shum, M.: Generalized empirical likelihood-based model selection criteria for moment condition models. Econ. Theory 19, 923–943 (2003)
- 25. Zellner, A., Keuzenkamp, H.A., Mcaleer, M.: Simplicity, Inference and Modelling: Keeping It Sophisticatedly Simple. Cambridge University Press, Cambridge (2001)
- 26. Owen, A.B.: Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249 (1988)
- 27. Owen, A.B.: Empirical Likelihood. Chapman and Hall, New York (2001)
- 28. Qin, J., Lawless, J.: Empirical likelihood and general estimating equations. Ann. Stat. 22, 300–325 (1994)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.