On the estimation of prediction errors in linear regression models

Annals of the Institute of Statistical Mathematics

Abstract

Estimating the prediction error is a common practice in the statistical literature. Under a linear regression model, let e be the conditional prediction error and ê be its estimate. We use ρ(ê, e), the correlation coefficient between e and ê, to measure the performance of a particular estimation method. Reasons are given why correlation is chosen over the more popular mean squared error loss. The main results of this paper show that it is generally not possible to obtain good estimates of the prediction error. In particular, we show that ρ(ê, e) = O(n^{-1/2}) as n → ∞. When the sample size is small, we argue that high values of ρ(ê, e) can be achieved only when the residual error distribution has very heavy tails and when no outliers are present in the data. Finally, we show that in order for ρ(ê, e) to be bounded away from zero asymptotically, ê has to be biased.
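
The quantity ρ(ê, e) can be explored with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions, not the paper's procedure: the abstract does not give the exact definitions, so here e is taken to be the average squared prediction error at the design points conditional on the fitted line, and ê is taken to be a Mallows-type estimate RSS/n + 2pσ̂²/n. The function names, the design, and the true coefficients are all hypothetical. With Gaussian errors, as used here, the fitted values and the residual sum of squares are independent, so the sample correlation should hover near zero.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def pearson(u, v):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    suv = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    suu = sum((a - ubar) ** 2 for a in u)
    svv = sum((b - vbar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def simulate_rho(n, reps=2000, sigma=1.0, seed=0):
    """Monte Carlo estimate of corr(e_hat, e) under the assumptions above."""
    rng = random.Random(seed)
    x = [i / (n - 1) for i in range(n)]      # fixed design on [0, 1] (assumed)
    beta0, beta1 = 1.0, 2.0                  # hypothetical true coefficients
    mu = [beta0 + beta1 * xi for xi in x]    # true regression function
    p = 2                                    # intercept and slope
    e_vals, e_hat_vals = [], []
    for _ in range(reps):
        # Gaussian errors (assumed); other distributions could be swapped in.
        y = [m + rng.gauss(0.0, sigma) for m in mu]
        b0, b1 = fit_line(x, y)
        fitted = [b0 + b1 * xi for xi in x]
        rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
        # Conditional prediction error at the design points (assumed definition).
        e = sigma ** 2 + sum((fi - mi) ** 2 for fi, mi in zip(fitted, mu)) / n
        # Mallows-type estimate of e (assumed form of the estimator).
        sigma2_hat = rss / (n - p)
        e_hat = rss / n + 2 * p * sigma2_hat / n
        e_vals.append(e)
        e_hat_vals.append(e_hat)
    return pearson(e_hat_vals, e_vals)


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, round(simulate_rho(n), 3))
```

Replacing `rng.gauss` with a heavier-tailed generator is one way to probe the small-sample behaviour the abstract describes, though any such experiment depends on the definitions assumed here.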

About this article

Cite this article

Zhang, P. On the estimation of prediction errors in linear regression models. Ann Inst Stat Math 45, 105–111 (1993). https://doi.org/10.1007/BF00773671

  • DOI: https://doi.org/10.1007/BF00773671
