High-dimensional simultaneous inference with the bootstrap

Dezeure, Ruben; Bühlmann, Peter; Zhang, Cun-Hui

doi:10.1007/s11749-017-0554-2

Invited Paper
Published: 09 October 2017

TEST, Volume 26, pages 685–719 (2017)

Abstract

We propose a residual and wild bootstrap methodology for individual and simultaneous inference in high-dimensional linear models with possibly non-Gaussian and heteroscedastic errors. We establish asymptotic consistency for simultaneous inference for parameters in groups G, where \(p \gg n\), \(s_0 = o(n^{1/2}/\{\log (p) \log (|G|)^{1/2}\})\) and \(\log (|G|) = o(n^{1/7})\), with p the number of variables, n the sample size and \(s_0\) the sparsity. The theory is complemented by many empirical results. Our proposed procedures are implemented in the R package hdi (Meier et al. 2016).
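
As a rough illustration of the wild bootstrap idea behind simultaneous inference, the sketch below uses an ordinary low-dimensional least-squares fit, not the high-dimensional estimator studied in the paper. It assumes Rademacher multipliers, heteroscedasticity-robust (sandwich) standard errors and a max-|t| statistic for calibration. The function names solve, ols_fit and wild_bootstrap_simultaneous_ci are hypothetical and are not part of the hdi package.

```python
import random


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting.

    A is p x p, B is p x m (lists of lists); returns Z as a p x m list of lists.
    """
    p = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(p):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[p:] for row in M]


def ols_fit(H, X, y):
    """Given H = (X'X)^{-1} X', return coefficients, residuals and robust SEs."""
    beta = [sum(h * yi for h, yi in zip(row, y)) for row in H]
    resid = [yi - sum(b * x for b, x in zip(beta, xi)) for xi, yi in zip(X, y)]
    # Sandwich variance of coefficient j: sum_i H[j][i]^2 * resid_i^2
    se = [sum((h * r) ** 2 for h, r in zip(row, resid)) ** 0.5 for row in H]
    return beta, resid, se


def wild_bootstrap_simultaneous_ci(X, y, n_boot=300, alpha=0.05, seed=0):
    """Simultaneous confidence intervals calibrated by a wild-bootstrap max-|t|."""
    rng = random.Random(seed)
    p = len(X[0])
    XtX = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)] for a in range(p)]
    Xt = [[xi[a] for xi in X] for a in range(p)]
    H = solve(XtX, Xt)
    beta, resid, se = ols_fit(H, X, y)
    fitted = [yi - r for yi, r in zip(y, resid)]
    max_stats = []
    for _ in range(n_boot):
        # Wild bootstrap response: flip the sign of each residual at random,
        # which preserves each observation's own error scale.
        y_star = [f + rng.choice((-1.0, 1.0)) * r for f, r in zip(fitted, resid)]
        b_star, _, se_star = ols_fit(H, X, y_star)
        max_stats.append(
            max(abs(bs - b) / s for bs, b, s in zip(b_star, beta, se_star))
        )
    max_stats.sort()
    # Approximate empirical (1 - alpha) quantile of the bootstrap maxima
    c = max_stats[min(n_boot - 1, int((1 - alpha) * n_boot))]
    intervals = [(b - c * s, b + c * s) for b, s in zip(beta, se)]
    return beta, intervals, c
```

Calling wild_bootstrap_simultaneous_ci(X, y) with X a list of covariate rows (including a constant column if an intercept is wanted) and y a list of responses returns the coefficient estimates, one interval per coefficient, and the common critical value. Because that critical value comes from the maximum over all coefficients, the intervals are meant to hold jointly rather than one at a time.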

References

Belloni A, Chernozhukov V, Chetverikov D, Wei Y (2015a) Uniformly valid post-regularization confidence regions for many functional parameters in Z-estimation. Preprint arXiv:1512.07619
Belloni A, Chernozhukov V, Kato K (2015b) Uniform post-selection inference for least absolute deviation regression and other Z-estimation problems. Biometrika 102(1):77–94
Bickel P, Klaassen C, Ritov Y, Wellner J (1998) Efficient and adaptive estimation for semiparametric models. Springer, Berlin
Breiman L (1996) Heuristics of instability and stabilization in model selection. Ann Stat 24:2350–2383
Bühlmann P (2013) Statistical significance in high-dimensional linear models. Bernoulli 19:1212–1242
Bühlmann P, van de Geer S (2011) Statistics for high-dimensional data: methods, theory and applications. Springer, Berlin
Bühlmann P, van de Geer S (2015) High-dimensional inference in misspecified linear models. Electron J Stat 9:1449–1473
Bühlmann P, Kalisch M, Meier L (2014) High-dimensional statistics with a view towards applications in biology. Annu Rev Stat Appl 1:255–278
Chatterjee A, Lahiri S (2011) Bootstrapping Lasso estimators. J Am Stat Assoc 106:608–625
Chatterjee A, Lahiri S (2013) Rates of convergence of the adaptive LASSO estimators to the oracle distribution and higher order refinements by the bootstrap. Ann Stat 41:1232–1259
Chernozhukov V, Chetverikov D, Kato K (2013) Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Ann Stat 41:2786–2819
Chernozhukov V, Chetverikov D, Kato K (2014) Central limit theorems and bootstrap in high dimensions. Ann Probab, to appear. Preprint arXiv:1412.3661
Chernozhukov V, Hansen C, Spindler M (2016) hdm: high-dimensional metrics. Preprint arXiv:1608.00354
Deng H, Zhang C-H (2017) Beyond Gaussian approximation: bootstrap in large scale simultaneous inference. Unpublished work in progress
Dezeure R, Bühlmann P, Meier L, Meinshausen N (2015) High-dimensional inference: confidence intervals, \(p\)-values and R-software hdi. Stat Sci 30:533–558
Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7:1–26
Eicker F (1967) Limit theorems for regressions with unequal and dependent errors. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, pp 59–82
Foygel Barber R, Candès EJ (2015) Controlling the false discovery rate via knockoffs. Ann Stat 43:2055–2085
Freedman DA (1981) Bootstrapping regression models. Ann Stat 9:1218–1228
Giné E, Zinn J (1989) Necessary conditions for the bootstrap of the mean. Ann Stat 17:684–691
Giné E, Zinn J (1990) Bootstrapping general empirical measures. Ann Probab 18:851–869
Hall P, Wilson SR (1991) Two guidelines for bootstrap hypothesis testing. Biometrics 47:757–762
Huber PJ (1967) The behavior of maximum likelihood estimates under nonstandard conditions. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, pp 221–233
Javanmard A, Montanari A (2014) Confidence intervals and hypothesis testing for high-dimensional regression. J Mach Learn Res 15:2869–2909
Liu RY, Singh K (1992) Efficiency and robustness in resampling. Ann Stat 20:370–384
Liu H, Yu B (2013) Asymptotic properties of Lasso+mLS and Lasso+Ridge in sparse high-dimensional linear regression. Electron J Stat 7:3124–3169
Mammen E (1993) Bootstrap and wild bootstrap for high dimensional linear models. Ann Stat 21:255–285
McKeague IW, Qian M (2015) An adaptive resampling test for detecting the presence of significant predictors. J Am Stat Assoc 110:1422–1433
Meier L, Dezeure R, Meinshausen N, Mächler M, Bühlmann P (2016) hdi: high-dimensional inference. R package version 0.1-6
Meinshausen N (2015) Group bound: confidence intervals for groups of variables in sparse high dimensional regression without assumptions on the design. J R Stat Soc B 77:923–945
Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the Lasso. Ann Stat 34:1436–1462
Meinshausen N, Bühlmann P (2010) Stability selection (with discussion). J R Stat Soc B 72:417–473
Meinshausen N, Meier L, Bühlmann P (2009) P-values for high-dimensional regression. J Am Stat Assoc 104:1671–1681
Meinshausen N, Maathuis MH, Bühlmann P (2011) Asymptotic optimality of the Westfall-Young permutation procedure for multiple testing under dependence. Ann Stat 39:3369–3391
Reid S, Tibshirani R, Friedman J (2016) A study of error variance estimation in Lasso regression. Stat Sinica 26:35–67
Rudelson M, Zhou S (2013) Reconstruction from anisotropic random measurements. IEEE Trans Inf Theory 59:3434–3447
Shah R, Samworth R (2013) Variable selection with error control: another look at stability selection. J R Stat Soc B 75:55–80
Shah R, Bühlmann P (2015) Goodness of fit tests for high-dimensional linear models. J R Stat Soc B. doi:10.1111/rssb.12234
van de Geer S, Bühlmann P, Zhou S (2011) The adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso). Electron J Stat 5:688–749
van de Geer S, Bühlmann P, Ritov Y, Dezeure R (2014) On asymptotically optimal confidence regions and tests for high-dimensional models. Ann Stat 42:1166–1202
Wasserman L, Roeder K (2009) High dimensional variable selection. Ann Stat 37:2178–2201
Westfall P, Young S (1993) Resampling-based multiple testing: examples and methods for P-value adjustment. Wiley, Hoboken
White H (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48:817–838
Wu C-FJ (1986) Jackknife, bootstrap and other resampling methods in regression analysis. Ann Stat 14:1261–1295
Ye F, Zhang C-H (2010) Rate minimaxity of the Lasso and Dantzig selector for the \(\ell _q\) loss in \(\ell _r\) balls. J Mach Learn Res 11:3481–3502
Zhang C-H, Huang J (2008) The sparsity and bias of the Lasso selection in high-dimensional linear regression. Ann Stat 36:1567–1594
Zhang C-H, Zhang SS (2014) Confidence intervals for low dimensional parameters in high dimensional linear models. J R Stat Soc B 76:217–242
Zhang X, Cheng G (2016) Simultaneous inference for high-dimensional linear models. J Am Stat Assoc. doi:10.1080/01621459.2016.1166114
Zhou Q (2014) Monte Carlo simulation for Lasso-type problems by estimator augmentation. J Am Stat Assoc 109:1495–1516

Acknowledgements

We gratefully acknowledge visits at the American Institute of Mathematics (AIM), San Jose, USA, and at the Mathematisches Forschungsinstitut (MFO), Oberwolfach, Germany. We also thank anonymous reviewers for constructive comments.

Author information

Authors and Affiliations

Seminar for Statistics, ETH Zürich, HG G 17, Rämistrasse 101, 8092, Zurich, Switzerland
Ruben Dezeure & Peter Bühlmann
Department of Statistics and Biostatistics, Rutgers University, 569 Hill Center, Busch Campus, Piscataway, NJ, 08854-8019, USA
Cun-Hui Zhang

Corresponding author

Correspondence to Peter Bühlmann.

Additional information

Ruben Dezeure is partially supported by the Swiss National Science Foundation SNF 2-77991-14. Cun-Hui Zhang is partially supported by NSF Grants DMS-12-09014 and DMS-15-13378 and NSA Grant H98230-15-1-0040.

This invited paper is discussed in comments available at: doi:10.1007/s11749-017-0555-1; doi:10.1007/s11749-017-0556-0; doi:10.1007/s11749-017-0557-z; doi:10.1007/s11749-017-0558-y; doi:10.1007/s11749-017-0559-x.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 510 KB)

About this article

Cite this article

Dezeure, R., Bühlmann, P. & Zhang, C.-H. High-dimensional simultaneous inference with the bootstrap. TEST 26, 685–719 (2017). https://doi.org/10.1007/s11749-017-0554-2

Published: 09 October 2017
Issue Date: December 2017
DOI: https://doi.org/10.1007/s11749-017-0554-2
