Ridge regression and its degrees of freedom

Quality & Quantity

Abstract

For ridge regression the degrees of freedom are commonly calculated as the trace of the matrix that transforms the vector of observations on the dependent variable into the ridge regression estimate of its expected value. For a fixed ridge parameter this is unobjectionable. When the ridge parameter is optimized on the same data, however, by minimizing the generalized cross-validation criterion or Mallows' \(C_{L}\), additional degrees of freedom are used. We give formulae that take this into account. This allows a proper assessment of ridge regression in competitions for the best predictor.
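
As a rough illustration of the conventional calculation described in the abstract, the following sketch computes the trace of the ridge "hat" matrix and picks the ridge parameter by minimizing generalized cross-validation on a grid. The function names, the simulated data, and the grid of candidate values are assumptions made for this example; the corrected formulae of the article are not reproduced here, so `df_fixed` is only the trace-based figure that the abstract argues understates the degrees of freedom once the parameter is chosen on the same data.

```python
import numpy as np

def ridge_hat_matrix(X, lam):
    """H(lam) = X (X'X + lam I)^{-1} X', which maps y to the ridge fit."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

def effective_df(X, lam):
    """Trace of H(lam), computed from the singular values d_i of X
    as sum_i d_i^2 / (d_i^2 + lam)."""
    d = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(d**2 / (d**2 + lam)))

def gcv(X, y, lam):
    """Generalized cross-validation score: n * RSS / (n - tr H)^2."""
    n = X.shape[0]
    H = ridge_hat_matrix(X, lam)
    resid = y - H @ y
    return float(n * (resid @ resid) / (n - np.trace(H)) ** 2)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -1.0]) + rng.standard_normal(50)

# Choose the ridge parameter by minimizing GCV over an assumed grid.
lams = np.logspace(-3, 3, 61)
scores = [gcv(X, y, lam) for lam in lams]
lam_hat = lams[int(np.argmin(scores))]

# Conventional degrees of freedom, treating lam_hat as if it were fixed.
df_fixed = effective_df(X, lam_hat)
print(f"lambda = {lam_hat:.4g}, trace-based df = {df_fixed:.3f}")
```

With the parameter set to zero the trace equals the number of regressors, and it shrinks toward zero as the parameter grows; the sketch treats the regressors as given and includes no intercept.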

Notes

  1. If \(z\) is a Gaussian variable and \(f(z)\) a smooth function of \(z\), then the slope of the best linear fit (regression) of \(f(z)\) on \(z\) is the average derivative of \(f(z)\). This property characterizes the Gaussian distribution.

  2. See e.g. Dhrymes (1978), appendix 4, on matrix differentiation; \(\operatorname{vec}(\cdot)\) stacks the columns of its matrix argument in their natural order, and \(\otimes\) is the Kronecker product.

  3. I am grateful to the authors of the publications referred to for allowing me to use their data.

  4. Anand et al. (2009) suggest this split, which appears to be rather reasonable. The Economist of December 18th 2010 featured an article titled “The joy of growing old (or why life begins at 46)”. Reference is made to the ‘U-curve’ in reported well-being as a function of age that reaches a nadir at a global average of 46 years.

  5. This is the ‘leave-one-out’ version of cross-validation. See Stone’s comments on Shao (1997), and Burman (1989), for a justification for using leave-one-out instead of v-fold cross-validation.
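
The leave-one-out cross-validation mentioned in note 5 can be sketched for ridge regression with a fixed ridge parameter. For a linear fit of this kind, a well-known identity gives each leave-one-out residual as the ordinary residual divided by one minus the corresponding diagonal element of the hat matrix, so no refitting is needed. The function names and simulated data below are assumptions made for this example, not taken from the article; the brute-force version is included only to check the shortcut.

```python
import numpy as np

def loo_cv_shortcut(X, y, lam):
    """Leave-one-out mean squared error from a single fit:
    e_i / (1 - h_ii), with h_ii the diagonal of the ridge hat matrix."""
    p = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return float(np.mean((resid / (1.0 - np.diag(H))) ** 2))

def loo_cv_brute(X, y, lam):
    """Leave-one-out mean squared error by refitting n times."""
    n, p = X.shape
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # boolean mask dropping observation i
        Xk, yk = X[keep], y[keep]
        beta = np.linalg.solve(Xk.T @ Xk + lam * np.eye(p), Xk.T @ yk)
        errs[i] = y[i] - X[i] @ beta
    return float(np.mean(errs ** 2))

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + rng.standard_normal(40)

lam = 1.5  # an arbitrary fixed ridge parameter
shortcut = loo_cv_shortcut(X, y, lam)
brute = loo_cv_brute(X, y, lam)
print(shortcut, brute)
```

The two numbers agree up to rounding error as long as the same ridge parameter is used in every refit and no intercept or re-centering is involved; with data-dependent preprocessing the shortcut is only approximate.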

References

  • Anand, P., Hunter, G., Dowding, K., Guala, F., van Hees, M.: The development of capability indicators. J. Hum. Dev. Capab. 10(1), 125–152 (2009)

  • Berk, R.A.: Statistical Learning from a Regression Perspective. Springer, New York (2008)

  • Burman, P.: A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76(3), 503–514 (1989)

  • Dijkstra, T.K.: Subjective well-being and its possible determinants. University of Groningen, Faculty of Economics and Business, working paper. www.rug.nl/staff/t.k.dijkstra/subjectivewell-being.pdf (2011)

  • Dhrymes, P.J.: Introductory Econometrics. Springer, San Francisco (1978)

  • Efron, B.: The estimation of prediction error: covariance penalties and cross-validation. J. Am. Stat. Assoc. 99(467), 619–632 (2004)

  • Golub, G.H., Heath, M., Wahba, G.: Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics 21(2), 215–223 (1979)

  • Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)

  • Izenman, A.J.: Modern Multivariate Statistical Techniques. Springer, New York (2008)

  • Li, K.-C.: Asymptotic optimality of \(C_{L}\) and generalized cross-validation in ridge regression with application to spline-smoothing. Ann. Stat. 14(3), 1101–1112 (1986)

  • McDonald, G.C., Schwing, R.C.: Instabilities of regression estimates relating air pollution to mortality. Technometrics 15(3), 463–481 (1973)

  • McDonald, G.C.: Ridge regression. Wiley Interdiscip. Rev. Comput. Stat. 1, 93–100 (2009)

  • McDonald, G.C.: Tracing ridge regression coefficients. Wiley Interdiscip. Rev. Comput. Stat. 2, 695–703 (2010)

  • Shao, J.: An asymptotic theory for linear model selection (with discussion). Stat. Sinica 7, 221–264 (1997)

Author information

Corresponding author

Correspondence to Theo K. Dijkstra.

About this article

Cite this article

Dijkstra, T.K. Ridge regression and its degrees of freedom. Qual Quant 48, 3185–3193 (2014). https://doi.org/10.1007/s11135-013-9949-7

  • DOI: https://doi.org/10.1007/s11135-013-9949-7
