Equivalence between adaptive Lasso and generalized ridge estimators in linear regression with orthogonal explanatory variables after optimizing regularization parameters

  • Mineaki Ohishi
  • Hirokazu Yanagihara
  • Shuichi Kawano


In this paper, we consider a penalized least-squares (PLS) method for a linear regression model with orthogonal explanatory variables. The penalties used are an adaptive Lasso (AL)-type \(\ell _1\) penalty (the AL penalty) and a generalized ridge (GR)-type \(\ell _2\) penalty (the GR penalty). Since the estimators obtained by minimizing the PLS criteria depend strongly on the regularization parameters, we optimize those parameters by a model selection criterion (MSC) minimization method. The estimators based on the AL penalty and the GR penalty have different properties and are generally regarded as completely different estimators. However, in this paper we show the interesting result that, when the explanatory variables are orthogonal, the two estimators are exactly equal once the regularization parameters are optimized by the MSC minimization method.
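The mechanism behind this equivalence can be illustrated numerically. The sketch below is our own illustration under standard definitions, not the paper's derivation, and it does not implement the MSC minimization itself: with orthogonal columns (\(X^\top X = I\)), both estimators act coordinate-wise on the ordinary least-squares estimate, the AL estimator by weighted soft-thresholding and the GR estimator by shrinkage with one ridge parameter per coordinate, so suitably chosen ridge parameters reproduce any AL fit.

```python
import numpy as np

# Illustration (not the paper's proof): for y = X beta + eps with
# X'X = I_k, both penalized LS estimators act coordinate-wise on the
# OLS estimate b = X'y.

rng = np.random.default_rng(0)
n, k = 50, 5
# QR decomposition yields a design matrix with orthonormal columns.
X, _ = np.linalg.qr(rng.standard_normal((n, k)))
beta = np.array([2.0, 0.0, -1.5, 0.0, 0.5])
y = X @ beta + 0.3 * rng.standard_normal(n)

b = X.T @ y  # OLS estimate under orthogonality

def adaptive_lasso(b, lam, gamma=1.0):
    """AL estimator: soft-thresholding with adaptive weights
    w_j = |b_j|^(-gamma) (Zou 2006)."""
    w = np.abs(b) ** (-gamma)
    return np.sign(b) * np.maximum(np.abs(b) - lam * w, 0.0)

def generalized_ridge(b, delta):
    """GR estimator: coordinate-wise shrinkage b_j / (1 + delta_j),
    one ridge parameter delta_j >= 0 per coordinate."""
    return b / (1.0 + np.asarray(delta))

al = adaptive_lasso(b, lam=0.05)

# Because GR has a separate parameter per coordinate, it can match the
# AL fit exactly: delta_j = b_j / al_j - 1 on nonzero coordinates, and
# delta_j -> infinity shrinks a coordinate all the way to zero.
delta = np.full_like(b, np.inf)
nz = al != 0
delta[nz] = b[nz] / al[nz] - 1.0

gr = generalized_ridge(b, delta)
assert np.allclose(al, gr)
```

This only shows that the two estimator classes overlap coordinate-wise under orthogonality; the paper's contribution is that the specific parameter values chosen by MSC minimization land on the same estimator.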


Keywords: Adaptive Lasso, \(C_p\) criterion, GCV criterion, Generalized ridge regression, GIC, Linear regression, Model selection criterion, Optimization problem, Regularization parameters, Sparsity



The authors thank Prof. Emer. Yasunori Fujikoshi, Dr. Shintaro Hashimoto, and Mr. Ryoya Oda of Hiroshima University and Dr. Tomoyuki Nakagawa of Tokyo University of Science for helpful comments. The authors also thank the associate editor and the reviewers for their valuable comments.



Copyright information

© The Institute of Statistical Mathematics, Tokyo 2019

Authors and Affiliations

  • Mineaki Ohishi (1)
  • Hirokazu Yanagihara (1)
  • Shuichi Kawano (2)
  1. Department of Mathematics, Graduate School of Science, Hiroshima University, Higashi-Hiroshima, Japan
  2. Department of Computer and Network Engineering, Graduate School of Informatics and Engineering, The University of Electro-Communications, Chofu, Japan
