Abstract
We propose forward variable selection procedures with a stopping rule for feature screening in ultra-high-dimensional quantile regression models. For such very large models, penalized methods do not work well and some preliminary feature screening is necessary. We establish the desirable theoretical properties of our forward procedures by properly handling uniformity with respect to subsets of covariates; the necessity of such uniformity is often overlooked in the literature. Our stopping rule suitably incorporates the model size at each stage. Simulation studies and a real data application demonstrate the good finite-sample performance of the proposed procedures.
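To make the general idea concrete, the following is a minimal schematic sketch of forward variable selection for quantile regression, not the authors' exact procedure: at each stage, the covariate giving the largest reduction in the empirical check loss is added, and a generic EBIC-style criterion that penalizes the current model size (an assumed form, standing in for the paper's stopping rule) decides when to stop. The quantile fits are computed exactly via the standard linear-programming formulation; the function names and the specific penalty `log(n)*log(p)` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def qr_fit_loss(X, y, tau):
    """Fit quantile regression exactly via its LP formulation and return
    the average check (pinball) loss:
        min_{beta,u,v} tau*sum(u) + (1-tau)*sum(v)
        s.t. X beta + u - v = y,  u >= 0,  v >= 0.
    """
    n, d = X.shape
    c = np.concatenate([np.zeros(d), np.full(n, tau), np.full(n, 1.0 - tau)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * d + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return max(res.fun / n, 1e-10)  # floor avoids log(0) in the criterion

def forward_qr(X, y, tau=0.5, penalty=None):
    """Greedy forward selection with a size-aware, EBIC-style stopping rule
    (an assumed illustrative criterion, not the paper's exact rule)."""
    n, p = X.shape
    if penalty is None:
        penalty = np.log(n) * np.log(p)   # per-covariate penalty
    ones = np.ones((n, 1))                # intercept is always included
    active = []
    best_crit = n * np.log(qr_fit_loss(ones, y, tau))
    while len(active) < min(n // 4, p):
        cands = [j for j in range(p) if j not in active]
        losses = [qr_fit_loss(np.hstack([ones, X[:, active + [j]]]), y, tau)
                  for j in cands]
        j_best = cands[int(np.argmin(losses))]
        crit = n * np.log(min(losses)) + (len(active) + 1) * penalty
        if crit >= best_crit:
            break                          # stopping rule: no sufficient gain
        active.append(j_best)
        best_crit = crit
    return active
```

On simulated data with two truly active covariates, such a sketch typically recovers the active set after a few forward steps and then stops, since the penalty term grows with the model size while the check loss barely improves.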
Acknowledgements
We are grateful to the two reviewers for their valuable comments, and to Prof. Ching-Kang Ing for his comments and help. Honda’s research was supported in part by JSPS KAKENHI Grant Number JP 20K11705, Japan. Lin’s research was supported by grant 111-2118-M-035-007-MY2 from the National Science and Technology Council, Taiwan.
About this article
Cite this article
Honda, T., Lin, CT. Forward variable selection for ultra-high dimensional quantile regression models. Ann Inst Stat Math 75, 393–424 (2023). https://doi.org/10.1007/s10463-022-00849-z