Abstract
In this paper, a focused vector information criterion for model selection and model averaging is considered for the linear model with missing responses. Based on the focused information criterion of Hjort and Claeskens (J Am Stat Assoc 98:879–945, 2003) and the imputation idea, a frequentist model averaging estimator for a focused vector of a linear model is proposed, and the estimator is shown to be root-n consistent and asymptotically normal. In addition, the proposed focused vector information criterion is designed for a focused multidimensional parameter, which differs slightly from the conventional focused information criterion for a one-dimensional focused parameter. A model-averaging-based confidence interval estimation method and an estimator of the mean of the response are also proposed. A simulation study is conducted to investigate the finite-sample performance of the proposed estimator, and a real data example is presented to illustrate its application in practice.
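To fix ideas, the following sketch illustrates the general workflow the abstract describes: regression imputation of missing responses followed by a weighted average of submodel estimates of a focus parameter (here the mean response). This is a hedged illustration only, not the paper's exact procedure; the candidate submodels, the missingness rate, and the smooth-AIC-style weights (used here as a stand-in for the paper's FVIC-based weights) are all assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, 0.5, 0.0])
y = X @ beta + rng.normal(size=n)
delta = rng.random(n) < 0.8            # missingness indicator: True = response observed
y_obs = np.where(delta, y, np.nan)

# Step 1: regression imputation -- fit the full model on complete cases,
# then fill in the missing responses with fitted values.
beta_full, *_ = np.linalg.lstsq(X[delta], y_obs[delta], rcond=None)
y_imp = np.where(delta, y_obs, X @ beta_full)

# Step 2: fit a set of nested candidate submodels on the imputed data and
# record each submodel's estimate of the focus parameter mu = E[Y].
submodels = [[0], [0, 1], [0, 1, 2]]   # hypothetical candidate index sets
estimates, rss = [], []
for S in submodels:
    b, *_ = np.linalg.lstsq(X[:, S], y_imp, rcond=None)
    fitted = X[:, S] @ b
    estimates.append(fitted.mean())
    rss.append(np.sum((y_imp - fitted) ** 2))

# Step 3: combine the submodel estimates with smooth-AIC weights
# (a simple stand-in for focused-criterion-based weights).
k = np.array([len(S) for S in submodels])
aic = n * np.log(np.array(rss) / n) + 2 * k
w = np.exp(-0.5 * (aic - aic.min()))
w /= w.sum()
mu_hat = float(np.dot(w, estimates))
```

The averaged estimate `mu_hat` interpolates between the candidate models rather than committing to a single selected model, which is the basic motivation for model averaging over post-selection inference.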
References
Akaike H (1973) Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60:255–265
Azar B (2002) Finding a solution for missing data. Monit Psychol 33:70
Bradic J, Fan J, Wang W (2011) Penalized composite quasi-likelihood for ultrahigh-dimensional variable selection. J Roy Stat Soc Ser B 73:325–349
Cavanaugh J, Shumway R (1998) An Akaike information criterion for model selection in the presence of incomplete data. J Stat Plan Inf 67:45–65
Claeskens G, Consentino F (2008) Variable selection with incomplete covariate data. Biometrics 64:1062–1069
Du J, Zhang ZZ, Xie TF (2012) Model averaging in quantile regression. Commun Stat Theory Methods (to appear)
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Fan JQ, Li RZ (2004) New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. J Am Stat Assoc 99:710–723
Hens N, Aerts M, Molenberghs G (2006) Model selection for incomplete and design-based samples. Stat Med 25:2502–2520
Huang JZ, Wu CO, Zhou L (2002) Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika 89:111–128
Hjort NL, Claeskens G (2003) Frequentist model average estimators (with discussion). J Am Stat Assoc 98:879–945
Hjort NL, Claeskens G (2006) Focussed information criteria and model averaging for Cox's hazard regression model. J Am Stat Assoc 101:1449–1464
Jones MP (1996) Indicator and stratification methods for missing explanatory variables in multiple linear regression. J Am Stat Assoc 91:222–230
Liang H, Wang S, Carroll RJ (2007) Partially linear models with missing response variables and error-prone covariates. Biometrika 94:185–198
Little RJA, Rubin DB (1987) Statistical analysis with missing data. Wiley, New York
Leeb H, Pötscher BM (2003) The finite sample distribution of post-model-selection estimators and uniform versus non-uniform approximations. Econ Theory 19:100–142
Leeb H, Pötscher BM (2008) Can one estimate the unconditional distribution of post-model-selection estimators? Econ Theory 24:338–376
Leung G, Barron AR (2006) Information theory and mixing least-squares regressions. IEEE Trans Inf Theory 52:3396–3410
Meinshausen N, Bühlmann P (2006) High dimensional graphs and variable selection with the lasso. Ann Stat 34:1436–1462
Meinshausen N, Yu B (2009) Lasso-type recovery of sparse representations for high-dimensional data. Ann Stat 37:246–270
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Schomaker M, Wan ATK, Heumann C (2010) Frequentist model averaging with missing observations. Comput Stat Data Anal 54:3336–3347
Sun ZM, Zhang ZZ, Du J (2012) Semiparametric analysis of isotonic errors-in-variables regression models with missing response. Commun Stat Theory Methods 41:2034–2060
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Roy Stat Soc Ser B 58:267–288
Van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes: with applications to statistics. Springer, New York
Wang QH, Sun ZH (2007) Estimation in partially linear models with missing responses at random. J Multivar Anal 98:1470–1493
Wang H, Zhou SZF (2012) Interval estimation by frequentist model averaging. Commun Stat Theory Methods (forthcoming)
Xue LG (2009) Empirical likelihood for linear models with missing responses. J Multivar Anal 100:1353–1366
Yang Y (2001) Adaptive regression by mixing. J Am Stat Assoc 96:574–586
Yang YP, Xue LG, Cheng WH (2009) Empirical likelihood for a partially linear model with covariate data missing at random. J Stat Plan Inf 139:4143–4153
Zhang H, Wahba G, Lin Y, Voelker M, Ferris M, Klein R, Klein B (2004) Variable selection and model building via likelihood basis pursuit. J Am Stat Assoc 99:659–672
Zhang CH, Huang J (2008) The sparsity and bias of the lasso selection in high-dimensional linear regression. Ann Stat 36:1567–1594
Zhang XY, Liang H (2011) Focused information criterion and model averaging for generalized additive partial linear models. Ann Stat 39(1):174–200
Zhao PX, Xue LG (2010) Variable selection for semiparametric varying coefficient partially linear errors-in-variables models. J Multivar Anal 101(8):1872–1883
Acknowledgments
The author is grateful to anonymous referees for their careful reading and insightful comments on this paper. This work is supported by National Natural Science Foundation of China (No. 71101157); Program for Innovation Research in Central University of Finance and Economics; 2012 National Project of Statistical Research (2012LY138); Foundation of Academic Discipline Program at Central University of Finance and Economics; MOE (Ministry of Education in China) Project of Humanities and Social Sciences For Youth (10YJC790220); Fund of 211 Project at Central University of Finance and Economics.
Appendix: Proofs
The following Lemma 1 is needed to prove the theorems.
Lemma 1
Under condition C1, we have
Proof of Lemma 1
It is easily seen that
Note that
Lemma 1 then follows directly.
Proof of Theorem 1
Denote \(A_{n}=\frac{1}{n}\sum _{i=1}^{n}\Pi _SX_{i}X_{i}^{\top }\Pi _S^{\top }\). With a simple calculation, we obtain
where
Under Condition C1, it is not difficult to get
Thus, with Lemma 1 in hand, we have
The second-to-last equation follows from \((I-\Pi _S^{\top }\Pi _S)\left( \begin{array}{c} \ddot{ \beta }_0\\ 0 \end{array} \right) =0\). Thus, we have
Note that
By the central limit theorem and Slutsky's theorem, we have
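For reference, the form of Slutsky's theorem invoked here is the standard one: if a sequence converges in distribution and another converges in probability to a constant, they converge jointly, so

```latex
X_n \xrightarrow{\,d\,} X, \quad Y_n \xrightarrow{\,P\,} c
\;\Longrightarrow\;
X_n + Y_n \xrightarrow{\,d\,} X + c
\quad\text{and}\quad
Y_n X_n \xrightarrow{\,d\,} cX .
```

Here \(X_n\) plays the role of the centered and scaled average handled by the central limit theorem, while the remaining factors converge in probability to constants.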
A slight transformation of the last equation then completes the proof of Theorem 1.
Proof of Theorem 2
Theorem 2 follows from Theorem 1 by the delta method, together with Theorem 3 of Van der Vaart and Wellner (1996).
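For completeness, the delta method in its standard form (stated generically, not in the paper's specific notation) asserts that asymptotic normality is preserved under smooth transformations:

```latex
\sqrt{n}\,(\hat{\theta}-\theta) \xrightarrow{\,d\,} N(0,\Sigma)
\;\Longrightarrow\;
\sqrt{n}\,\bigl(g(\hat{\theta})-g(\theta)\bigr)
\xrightarrow{\,d\,}
N\!\bigl(0,\; \dot{g}(\theta)\,\Sigma\,\dot{g}(\theta)^{\top}\bigr),
```

for any map \(g\) that is differentiable at \(\theta\) with Jacobian \(\dot{g}(\theta)\).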
Proof of Theorem 3
It can be verified that
Since \(\hat{\eta }\stackrel{d}{\longrightarrow }\Delta \), we have
by the continuous mapping theorem, and then Theorem 3 follows directly.
Proof of Theorem 4
It is easily seen that
For \(H_{n3}\), we have
Note that
Plugging (3) into (2), we have
Since the term \(\frac{1}{n}\sum _{i=1}^{n}(1-\delta _{i})X^{\top }_i\stackrel{P}{\longrightarrow }(EX^{\top }-E\delta X^{\top })\), it is not difficult to get
This together with \(H_{n1}\) and \(H_{n2}\) implies that
Then, Theorem 4 follows from the central limit theorem. This completes the proof.
Cite this article
Sun, Z., Su, Z. & Ma, J. Focused vector information criterion model selection and model averaging regression with missing response. Metrika 77, 415–432 (2014). https://doi.org/10.1007/s00184-013-0446-8