Spline estimator for ultra-high dimensional partially linear varying coefficient models

  • Zhaoliang Wang
  • Liugen Xue
  • Gaorong Li
  • Fei Lu

Abstract

In this paper, we simultaneously study variable selection and estimation for sparse ultra-high dimensional partially linear varying coefficient models, where the number of variables in the linear part can grow much faster than the sample size while many of the coefficients are zero, and the dimension of the nonparametric part is fixed. We approximate each coefficient function with a B-spline basis. First, we establish the convergence rates and the asymptotic normality of the linear coefficients for the oracle estimator, that is, the estimator computed when the nonzero components are known in advance. Then, we propose a nonconvex penalized estimator and derive its oracle property under mild conditions. Furthermore, we address numerical implementation and the data-adaptive choice of the tuning parameters. Monte Carlo simulations and an application to a breast cancer data set corroborate our theoretical findings in finite samples.
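To make the phrase "nonconvex penalized estimator" concrete, the following is a minimal sketch of two widely used nonconvex penalties, SCAD and MCP. The abstract does not say which penalty the paper adopts, so both functions are illustrative assumptions. The function names, the parameter names `lam`, `a` and `gamma`, and the default values are chosen here for illustration and are not taken from the paper.

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty evaluated at |t|, assuming lam > 0 and a > 2.

    Linear (lasso-like) near zero, quadratic in the middle,
    and constant for large |t|, so large coefficients are not shrunk.
    """
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def mcp_penalty(t, lam, gamma=3.0):
    """MCP penalty evaluated at |t|, assuming lam > 0 and gamma > 1.

    Starts with slope lam at zero and flattens to a constant
    once |t| exceeds gamma * lam.
    """
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2


if __name__ == "__main__":
    # Compare both penalties with the lasso penalty lam * |t| on a small grid.
    lam = 1.0
    for t in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(t, lam * abs(t), scad_penalty(t, lam), mcp_penalty(t, lam))
```

Both penalties agree with the lasso penalty near zero, which encourages sparsity, but level off for large arguments. This flattening is the feature that typically allows a penalized estimator to behave like the oracle estimator on the truly nonzero coefficients.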

Keywords

High dimensionality · Partially linear varying coefficient model · Variable selection · Nonconvex penalty · Oracle property

Notes

Acknowledgements

The authors thank the Editor, the Associate Editor and two anonymous referees for their careful reading and constructive comments, which have helped us to significantly improve the paper. Zhaoliang Wang’s research was supported by the Graduate Science and Technology Foundation of Beijing University of Technology (ykj-2017-00276). Liugen Xue’s research was supported by the National Natural Science Foundation of China (11571025, Key grant: 11331011) and the Beijing Natural Science Foundation (1182002). Gaorong Li’s research was supported by the National Natural Science Foundation of China (11471029) and the Beijing Natural Science Foundation (1182003).

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2018

Authors and Affiliations

  • Zhaoliang Wang (1, 2), email author
  • Liugen Xue (1)
  • Gaorong Li (3)
  • Fei Lu (1)

  1. College of Applied Sciences, Beijing University of Technology, Beijing, China
  2. School of Mathematics and Information Science, Henan Polytechnic University, Jiaozuo, China
  3. Beijing Institute for Scientific and Engineering Computing, Beijing University of Technology, Beijing, China
