
Variable selection procedure from multiple testing

  • Baoxue Zhang
  • Guanghui Cheng
  • Chunming Zhang
  • Shurong Zheng

Abstract

Variable selection has played an important role in statistical learning and scientific discovery over the past ten years, and multiple testing is a fundamental problem in statistical inference with wide applications in many scientific fields. Significant advances have been achieved in both areas. This study seeks a connection between the adaptive LASSO (least absolute shrinkage and selection operator) and multiple testing procedures in linear regression models. We also propose procedures based on multiple testing methods that select variables while controlling the selection error rate, i.e., the false discovery rate. Simulation studies demonstrate that the proposed methods perform well in controlling the selection error rate under a wide range of settings.
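To illustrate the general idea of selecting variables through multiple testing, the following minimal sketch applies the classical Benjamini-Hochberg step-up rule to one p-value per regression coefficient. It is not the procedure proposed in the paper. The function name `bh_select`, the level `alpha`, and the example p-values are all hypothetical and chosen only for illustration. The p-values are assumed to come from tests of each coefficient being zero.

```python
def bh_select(p_values, alpha=0.1):
    """Return the indices of variables selected by the Benjamini-Hochberg rule.

    p_values[j] is assumed to be the p-value for testing that the j-th
    regression coefficient is zero; alpha is the target false discovery rate.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the variables, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda j: p_values[j])
    # Step-up rule: find the largest rank k with p_(k) <= alpha * k / m.
    k = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= alpha * rank / m:
            k = rank
    # Select the k variables with the smallest p-values.
    return sorted(order[:k])


# Hypothetical p-values for six candidate variables.
example_p_values = [0.001, 0.20, 0.012, 0.75, 0.03, 0.50]
selected = bh_select(example_p_values, alpha=0.1)
print(selected)
```

With these made-up inputs, the variables at indices 0, 2, and 4 are selected. The rule is step-up: a variable can be selected even if its own p-value exceeds its threshold, provided a larger p-value further down the ordering meets its threshold.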

Keywords

variable selection, multiple testing, adaptive LASSO, false discovery rate, linear regression

MSC(2010)

35J60, 35J70

Notes

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 11671268, 11522105, and 11690012). The authors thank the reviewers for their constructive comments, which helped us improve this manuscript substantially.

Copyright information

© Science China Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Baoxue Zhang (1)
  • Guanghui Cheng (2)
  • Chunming Zhang (3)
  • Shurong Zheng (2)

  1. School of Statistics, Capital University of Economics and Business, Beijing, China
  2. School of Mathematics and Statistics and Key Laboratory of Applied Statistics of Ministry of Education, Northeast Normal University, Changchun, China
  3. Department of Statistics, University of Wisconsin-Madison, Madison, USA