Abstract
We consider settings in which the data are inherently distributed and focus on statistical learning in the presence of heavy-tailed and/or asymmetric errors. The composite quantile regression (CQR) estimator is a robust and efficient alternative to the ordinary least squares and single quantile regression estimators. Building on aggregated and communication-efficient approaches, we propose two classes of sparse and debiased lasso CQR estimation and inference methods. Specifically, we first obtain an aggregated \(\ell _1\)-penalized CQR estimator and an \(\ell _1\)-penalized communication-efficient CQR estimator. To construct confidence intervals and perform hypothesis tests, a unified debiasing framework based on smoothed decorrelated score equations is introduced to eliminate the biases caused by the lasso penalty. Finally, a hard-thresholding method is employed to ensure that the debiased lasso estimators are sparse. The convergence rates and asymptotic properties of the proposed estimators are established, and their performance is evaluated through simulations and a real-world dataset.
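To fix ideas, the two building blocks named in the abstract, the \(\ell _1\)-penalized composite quantile loss and the final hard-thresholding step, can be sketched numerically. This is a minimal illustration under assumed notation (quantile levels \(\tau_1,\dots,\tau_K\), quantile-specific intercepts \(b_k\), a common slope \(\beta\)); it is not the authors' implementation (the paper's simulations are in R), and the function names are hypothetical.

```python
import numpy as np

def check_loss(u, tau):
    # Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0}).
    return u * (tau - (u < 0))

def cqr_objective(beta, b, X, y, taus, lam):
    # l1-penalized CQR objective:
    #   (1 / (n*K)) * sum_k sum_i rho_{tau_k}(y_i - b_k - x_i' beta)
    #     + lam * ||beta||_1
    # All K quantile levels share the slope beta; only the
    # intercepts b_k vary across levels.
    n = len(y)
    total = 0.0
    for k, tau in enumerate(taus):
        resid = y - b[k] - X @ beta
        total += check_loss(resid, tau).sum()
    return total / (n * len(taus)) + lam * np.abs(beta).sum()

def hard_threshold(beta, t):
    # Zero out coordinates with |beta_j| <= t, restoring sparsity
    # after the debiasing step (the debiased estimator is dense).
    return np.where(np.abs(beta) > t, beta, 0.0)
```

For example, `hard_threshold(np.array([0.5, 0.01, -2.0]), 0.1)` keeps the first and third coordinates and zeroes the second, which is how the debiased (hence non-sparse) estimator is mapped back to a sparse one.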
Data availability
The data are available from http://archive.ics.uci.edu/ml/datasets/communities+and+crime+unnormalized or from the authors upon request.
Code availability
All simulations are implemented in R; the code is available from the authors upon request.
Acknowledgements
The authors would like to thank the Editor, the Associate Editor, and two anonymous referees for their helpful comments and suggestions. Lei Wang's research was supported by the Fundamental Research Funds for the Central Universities and the National Natural Science Foundation of China (12271272).
Funding
Wang’s research was supported by the National Natural Science Foundation of China (12271272), the Natural Science Foundation of Tianjin (18JCYBJC41100) and the Fundamental Research Funds for the Central Universities.
Author information
Contributions
ZH: investigation, formal analysis, software; WM: investigation, formal analysis; LW: methodology, formal analysis, supervision, writing.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hou, Z., Ma, W. & Wang, L. Sparse and debiased lasso estimation and inference for high-dimensional composite quantile regression with distributed data. TEST 32, 1230–1250 (2023). https://doi.org/10.1007/s11749-023-00875-w