Abstract
In this paper, we develop a flexible semiparametric model averaging marginal regression procedure to forecast the joint conditional quantile function of the response variable for ultrahigh-dimensional data. First, we approximate the joint conditional quantile function by a weighted average of one-dimensional marginal conditional quantile functions with varying coefficient structures, and we use local linear regression to obtain consistent estimates of these marginal functions. Second, based on the estimated marginal conditional quantile functions, we estimate the model weights in the approximation and select the significant ones by nonconvex penalized quantile regression. Under some relaxed conditions, we establish asymptotic properties of the nonparametric kernel estimators and of the oracle estimators of the model averaging weights. We further derive the oracle property of the proposed nonconvex penalized model averaging procedure. Finally, simulation studies and a real data analysis illustrate the merits of the proposed model averaging method.
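The second step described in the abstract can be illustrated with a minimal sketch of a penalized quantile objective for the averaging weights. The abstract does not name the nonconvex penalty, so the SCAD penalty with a = 3.7 is an assumption here, as are the function names, the 1/n normalization of the loss, and the layout of `marginal_fits` (one row per observation, one column per estimated marginal conditional quantile). This is not the authors' implementation.

```python
def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def scad_penalty(w, lam, a=3.7):
    """SCAD penalty of a single weight (assumed choice of nonconvex penalty)."""
    t = abs(w)
    if t <= lam:
        # Lasso-like linear part near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant part: large weights are not penalized further.
    return lam * lam * (a + 1) / 2


def penalized_objective(weights, y, marginal_fits, tau, lam):
    """Average check loss of the weighted marginal fits plus the SCAD penalty.

    marginal_fits[i][j] is a hypothetical placeholder for the estimated j-th
    marginal conditional quantile evaluated at observation i.
    """
    n = len(y)
    loss = 0.0
    for yi, row in zip(y, marginal_fits):
        pred = sum(wj * mj for wj, mj in zip(weights, row))
        loss += check_loss(yi - pred, tau)
    penalty = sum(scad_penalty(wj, lam) for wj in weights)
    return loss / n + penalty
```

Minimizing such an objective over the weights would shrink small weights to zero while leaving large ones nearly unpenalized, which is the behavior that selection of significant weights relies on. The optimizer itself is omitted from this sketch.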
Acknowledgements
We appreciate the constructive suggestions from the referees and the editors.
Additional information
Supported by the National Natural Science Foundation of China Grant (Grant No. 12201091), Natural Science Foundation of Chongqing Grant (Grant Nos. CSTB2022NSCQ-MSX0852, cstc2021jcyj-msxmX0502), Innovation Support Program for Chongqing Overseas Returnees (Grant No. cx2020025), Science and Technology Research Program of Chongqing Municipal Education Commission (Grant Nos. KJQN202100526, KJQN201900511), the National Statistical Science Research Program (Grant No. 2022LY019) and Chongqing University Innovation Research Group Project: Nonlinear Optimization Method and Its Application (Grant No. CXQT20014)
Cite this article
Guo, C.H., Lv, J., Yang, H. et al. Semiparametric Model Averaging for Ultrahigh-Dimensional Conditional Quantile Prediction. Acta. Math. Sin.-English Ser. 39, 1171–1202 (2023). https://doi.org/10.1007/s10114-023-0346-4
DOI: https://doi.org/10.1007/s10114-023-0346-4
Keywords
- Kernel regression
- model averaging
- oracle property
- penalized quantile regression
- ultrahigh-dimensional data