Abstract
Regression problems in which the number of predictors, p, exceeds the number of observations, n, have become increasingly important in many diverse fields over the last couple of decades. In the classical case of “small p and large n,” the least squares estimator is a practical and effective tool for estimating the model parameters. In the so-called Big Data era, however, models often have p much larger than n. Statisticians have developed a number of regression techniques for dealing with such problems, such as the Lasso by Tibshirani (J R Stat Soc Ser B Stat Methodol 58:267–288, 1996), the SCAD by Fan and Li (J Am Stat Assoc 96(456):1348–1360, 2001), the LARS algorithm by Efron et al. (Ann Stat 32(2):407–499, 2004), the MCP estimator by Zhang (Ann Stat 38:894–942, 2010), and a tuning-free regression algorithm by Chatterjee (High dimensional regression and matrix estimation without tuning parameters, 2015, https://arxiv.org/abs/1510.07294). In this paper, we investigate the relative performance of some of these methods for parameter estimation and variable selection by analyzing real and synthetic data sets. Through an extensive Monte Carlo simulation study, we also compare their relative performance under a correlated design matrix.
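As a minimal sketch of the kind of simulation design described above (not the authors' exact setup), the following generates a sparse high-dimensional linear model with an AR(1)-correlated Gaussian design and fits the Lasso with a cross-validated tuning parameter via scikit-learn. The dimensions n = 50, p = 200, the correlation level rho = 0.5, and the sparsity pattern are illustrative assumptions, and only the Lasso is shown; SCAD and MCP require other packages.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, rho = 50, 200, 0.5  # n << p; rho controls design correlation (assumed values)

# AR(1) covariance: Sigma[i, j] = rho^|i - j|
Sigma = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# Sparse truth: only the first 5 coefficients are nonzero
beta = np.zeros(p)
beta[:5] = 2.0
y = X @ beta + rng.normal(scale=1.0, size=n)

# Lasso with 5-fold cross-validation over the regularization path
fit = LassoCV(cv=5).fit(X, y)

est_error = np.linalg.norm(fit.coef_ - beta)  # parameter estimation error
selected = np.flatnonzero(fit.coef_)          # indices chosen by variable selection
print(f"l2 estimation error: {est_error:.3f}, variables selected: {selected.size}")
```

In a full Monte Carlo study this fit would be repeated over many replications and correlation levels, averaging the estimation error and recording how often the true support is recovered.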
References
Ahmed, S. E. (2014). Penalty, shrinkage and pretest strategies: Variable selection and estimation. New York: Springer.
Belloni, A., Chernozhukov, V., & Wang, L. (2011). Square-root lasso: pivotal recovery of sparse signals via conic programming. Biometrika, 98(4), 791–806.
Bickel, P. J., Ritov, Y., & Tsybakov, A. B. (2009). Simultaneous analysis of lasso and Dantzig selector. The Annals of Statistics, 37, 1705–1732.
Bühlmann, P., Kalisch, M., & Meier, L. (2014). High-dimensional statistics with a view towards applications in biology. Annual Review of Statistics and Its Application, 1, 255–278.
Bühlmann, P., & van de Geer, S. (2011). Statistics for high-dimensional data: Methods, theory and applications. Springer series in statistics. Heidelberg: Springer.
Chatterjee, S. (2015). High dimensional regression and matrix estimation without tuning parameters. https://arxiv.org/abs/1510.07294
Chatterjee, S., & Jafarov, J. (2016). Prediction error of cross-validated Lasso. https://arxiv.org/abs/1502.06291
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32(2), 407–499.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1.
Frank, I., & Friedman, J. (1993). A statistical view of some chemometrics regression tools (with discussion). Technometrics, 35, 109–148.
Fu, W. J. (1998). Penalized regressions: The bridge versus the lasso. Journal of Computational and Graphical Statistics, 7(3), 397–416.
Greenshtein, E., & Ritov, Y. A. (2004). Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli, 10(6), 971–988.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. Data mining, inference, and prediction. Springer series in statistics (2nd ed.). New York: Springer.
Hastie, T., Wainwright, M., & Tibshirani, R. (2016). Statistical learning with sparsity: The lasso and generalizations. Boca Raton, FL: Chapman and Hall/CRC.
Hebiri, M., & Lederer, J. (2013). How correlations influence lasso prediction. IEEE Transactions on Information Theory, 59(3), 1846–1854.
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for non-orthogonal problems. Technometrics, 12, 69–82.
Huang, J., Ma, S., & Zhang, C. H. (2008). Adaptive Lasso for sparse high-dimensional regression models. Statistica Sinica, 18, 1603–1618.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning with applications in R. Springer texts in statistics. New York: Springer.
Knight, K., & Fu, W. (2000). Asymptotics for lasso-type estimators. The Annals of Statistics, 28, 1356–1378.
Lederer, J., & Müller, C. (2014). Don’t fall for tuning parameters: Tuning-free variable selection in high dimensions with the TREX. Preprint. arXiv:1404.0541.
Leng, C., Lin, Y., & Wahba, G. (2006). A note on the lasso and related procedures in model selection. Statistica Sinica, 16, 1273–1284.
Meinshausen, N., & Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3), 1436–1462.
Reid, S., Tibshirani, R., & Friedman, J. (2016). A study of error variance estimation in lasso regression. Statistica Sinica, 26(1), 35–67.
Stamey, T. A., Kabalin, J. N., McNeal, J. E., Johnstone, I. M., Freiha, F., Redwine, E. A., et al. (1989). Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate. II. Radical prostatectomy treated patients. The Journal of Urology, 141(5), 1076–1083.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B Statistical Methodology, 58, 267–288.
Tibshirani, R. (2013). The lasso problem and uniqueness. Electronic Journal of Statistics, 7, 1456–1490.
Zhang, C. H. (2007). Penalized linear unbiased selection. Technical report 3, Department of Statistics and Bioinformatics, Rutgers University.
Zhang, C. H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38, 894–942.
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429.
Acknowledgements
The research of S. Ejaz Ahmed is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Ahmed, S.E., Kim, H., Yıldırım, G., Yüzbaşı, B. (2019). High-Dimensional Regression Under Correlated Design: An Extensive Simulation Study. In: Ahmed, S., Carvalho, F., Puntanen, S. (eds) Matrices, Statistics and Big Data. IWMS 2016. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-030-17519-1_11
DOI: https://doi.org/10.1007/978-3-030-17519-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17518-4
Online ISBN: 978-3-030-17519-1
eBook Packages: Mathematics and Statistics (R0)