Abstract
Estimating the prediction error is a common practice in the statistical literature. Under a linear regression model, let e be the conditional prediction error and ê be its estimate. We use ρ(ê, e), the correlation coefficient between e and ê, to measure the performance of a particular estimation method. Reasons are given why correlation is chosen over the more popular mean squared error loss. The main results of this paper conclude that it is generally not possible to obtain good estimates of the prediction error. In particular, we show that ρ(ê, e) = O(n^{−1/2}) as n → ∞. When the sample size is small, we argue that high values of ρ(ê, e) can be achieved only when the residual error distribution has very heavy tails and when no outliers are present in the data. Finally, we show that in order for ρ(ê, e) to be bounded away from zero asymptotically, ê has to be biased.
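The quantity ρ(ê, e) can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's procedure. Every concrete choice in it is an assumption made for illustration: the fixed design x_i = i/n, the true line y = 1 + 2x, the definition of e as the error variance plus the mean squared fitting error at the design points, the estimate ê = (1 + p/n) RSS/(n − p) with p = 2, and the function names `pearson` and `simulate_rho`.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def simulate_rho(n, draw_error, sigma2, reps=2000, seed=0):
    """Monte Carlo estimate of rho(e_hat, e) for simple linear regression.

    Illustrative assumptions (not taken from the paper): fixed design
    x_i = i / n, true line y = 1 + 2x, e = sigma2 + mean squared fitting
    error at the design points, e_hat = (1 + p / n) * RSS / (n - p), p = 2.
    """
    rng = random.Random(seed)
    p = 2
    x = [(i + 1) / n for i in range(n)]
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    truth = [1.0 + 2.0 * xi for xi in x]
    e_true, e_hat = [], []
    for _ in range(reps):
        y = [ti + draw_error(rng) for ti in truth]
        ybar = sum(y) / n
        # Ordinary least squares fit of intercept and slope.
        slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
        intercept = ybar - slope * xbar
        fitted = [intercept + slope * xi for xi in x]
        # Conditional prediction error given this training sample.
        fit_err = sum((fi - ti) ** 2 for fi, ti in zip(fitted, truth)) / n
        e_true.append(sigma2 + fit_err)
        # Residual-based estimate of the prediction error.
        rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
        e_hat.append((1 + p / n) * rss / (n - p))
    return pearson(e_hat, e_true)


def gaussian(rng):
    """Standard normal error, variance 1."""
    return rng.gauss(0.0, 1.0)


def laplace(rng):
    """Laplace error with scale 1 (variance 2), heavier tails than normal."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


if __name__ == "__main__":
    for n in (10, 40, 160):
        print(n, simulate_rho(n, gaussian, 1.0), simulate_rho(n, laplace, 2.0))
```

Running the script prints, for each sample size, the simulated correlation under normal and under Laplace errors, so the effect of n and of tail weight can be compared directly. The values are Monte Carlo estimates and vary with `reps` and `seed`.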
Cite this article
Zhang, P. On the estimation of prediction errors in linear regression models. Ann Inst Stat Math 45, 105–111 (1993). https://doi.org/10.1007/BF00773671