Statistical inference and large-scale multiple testing for high-dimensional regression models

  • Invited Paper
TEST

A Discussion to this article was published on 18 December 2023

A Discussion to this article was published on 11 December 2023

A Discussion to this article was published on 07 November 2023

A Discussion to this article was published on 03 November 2023

Abstract

This paper presents a selective survey of recent developments in statistical inference and multiple testing for high-dimensional regression models, including linear and logistic regression. We examine the construction of confidence intervals and hypothesis tests for various low-dimensional objectives such as regression coefficients and linear and quadratic functionals. The key technique is to construct debiased and desparsified estimators for the targeted low-dimensional objectives and to estimate their uncertainty. Beyond the motivations for and intuitions behind these statistical methods, we discuss their optimality and adaptivity in the context of high-dimensional inference. We also review recent developments in statistical inference based on multiple regression models and advances in large-scale multiple testing for high-dimensional regression. Some of the high-dimensional inference methods discussed in this paper are implemented in the R package SIHR.
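
As a rough illustration of the workflow summarized above, the following Python sketch starts from debiased coefficient estimates and their estimated standard errors, forms normal-approximation confidence intervals and two-sided p-values, and then applies the Benjamini–Hochberg step-up procedure to control the false discovery rate. It is only a minimal sketch under stated assumptions: the function names `wald_inference` and `benjamini_hochberg` and all input numbers are hypothetical, the debiasing step itself is assumed to have been carried out already, Benjamini–Hochberg is used as a familiar stand-in for the multiple testing procedures surveyed in the paper, and none of this reflects the interface of the SIHR package.

```python
import math
from statistics import NormalDist


def wald_inference(estimates, std_errors, level=0.95):
    """Normal-approximation intervals and two-sided p-values.

    Assumes each estimate is approximately N(true value, se^2), which is
    the kind of limit a debiased estimator is designed to satisfy.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    results = []
    for est, se in zip(estimates, std_errors):
        z = est / se
        # Two-sided p-value: P(|N(0,1)| > |z|) = erfc(|z| / sqrt(2)).
        p_value = math.erfc(abs(z) / math.sqrt(2.0))
        results.append((est - z_crit * se, est + z_crit * se, p_value))
    return results


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices rejected by the BH step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    return sorted(order[:k_max])


# Hypothetical debiased estimates and standard errors (illustrative only).
estimates = [2.1, 0.05, -1.4, 0.3, 0.0]
std_errors = [0.5, 0.4, 0.35, 0.45, 0.5]

inference = wald_inference(estimates, std_errors)
p_values = [p for _, _, p in inference]
rejected = benjamini_hochberg(p_values, alpha=0.05)

for j, (lo, hi, p) in enumerate(inference):
    flag = "reject" if j in rejected else "keep"
    print(f"coef {j}: CI=({lo:.3f}, {hi:.3f}), p={p:.2e}, {flag}")
```

The two steps are kept separate on purpose: the interval and p-value computation depends only on approximate normality of each estimate, while the multiplicity correction depends only on the resulting p-values, so either piece could be swapped for a different method.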

Acknowledgements

The research of Yin Xia was supported in part by NSFC grant 12022103. The research of Tony Cai was supported in part by NSF grant DMS-2015259 and NIH grant R01-GM129781. The research of Zijian Guo was supported in part by NSF grants DMS 1811857 and 2015373 and NIH grants R01GM140463 and R01LM013614.

Author information

Corresponding author

Correspondence to Yin Xia.

About this article

Cite this article

Cai, T.T., Guo, Z. & Xia, Y. Statistical inference and large-scale multiple testing for high-dimensional regression models. TEST 32, 1135–1171 (2023). https://doi.org/10.1007/s11749-023-00870-1

  • DOI: https://doi.org/10.1007/s11749-023-00870-1
