On the use of cross-validation to assess performance in multivariate prediction
 P. Jonathan,
 W. J. Krzanowski,
 W. V. McCarthy
Abstract
We describe a Monte Carlo investigation of a number of variants of cross-validation for the assessment of performance of predictive models, including different values of k in leave-k-out cross-validation, and implementation either in a one-deep or a two-deep fashion. We assume an underlying linear model that is being fitted using either ridge regression or partial least squares, and vary a number of design factors such as sample size n relative to number of variables p, and error variance. The investigation encompasses both the non-singular (i.e. n > p) and the singular (i.e. n ≤ p) cases. The latter is now common in areas such as chemometrics but has as yet received little rigorous investigation. Results of the experiments enable us to reach some definite conclusions and to make some practical recommendations.
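The abstract does not define "one-deep" and "two-deep", so the following is only a minimal sketch under a stated assumption: one-deep is read as choosing the ridge parameter by leave-k-out cross-validation and reporting that same minimised error, and two-deep as wrapping the choice in an outer leave-k-out loop so that the held-out cases play no part in it. The function names, the grid `lams`, the no-intercept ridge fit, and the simulated data are all illustrative assumptions and are not taken from the article.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients from the p x p normal equations (no intercept)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def folds(n, k):
    """Split indices 0..n-1 into consecutive held-out groups of size at most k."""
    return [np.arange(i, min(i + k, n)) for i in range(0, n, k)]

def cv_error(X, y, lam, k):
    """Mean squared leave-k-out prediction error for a fixed ridge parameter."""
    n = len(y)
    sq_err = 0.0
    for test in folds(n, k):
        train = np.setdiff1d(np.arange(n), test)
        beta = ridge_fit(X[train], y[train], lam)
        sq_err += np.sum((y[test] - X[test] @ beta) ** 2)
    return sq_err / n

def one_deep(X, y, lams, k):
    """Assumed one-deep scheme: pick lam by CV and report the minimised CV error."""
    errs = [cv_error(X, y, lam, k) for lam in lams]
    best = int(np.argmin(errs))
    return lams[best], errs[best]

def two_deep(X, y, lams, k):
    """Assumed two-deep scheme: the outer loop holds out k cases, and the
    inner CV chooses lam using the remaining cases only."""
    n = len(y)
    sq_err = 0.0
    for test in folds(n, k):
        train = np.setdiff1d(np.arange(n), test)
        lam, _ = one_deep(X[train], y[train], lams, k)
        beta = ridge_fit(X[train], y[train], lam)
        sq_err += np.sum((y[test] - X[test] @ beta) ** 2)
    return sq_err / n

# Simulated linear model in the singular case (n <= p); all values are illustrative.
rng = np.random.default_rng(0)
n, p = 20, 30
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

lams = [0.01, 0.1, 1.0, 10.0]
best_lam, err_one = one_deep(X, y, lams, k=1)
err_two = two_deep(X, y, lams, k=1)
```

Under this reading, `err_one` reuses the same held-out cases for both selection and assessment, while `err_two` keeps them separate at the cost of an extra loop. Because the ridge penalty is positive, the linear system stays solvable even when n ≤ p.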
 Title
 On the use of cross-validation to assess performance in multivariate prediction
 Journal
 Statistics and Computing
 Volume 10, Issue 3, pp 209-229
 Cover Date
 2000-07-01
 DOI
 10.1023/A:1008987426876
 Print ISSN
 0960-3174
 Online ISSN
 1573-1375
 Publisher
 Kluwer Academic Publishers
 Keywords

 cross-validation
 ridge regression
 partial least squares
 prediction
 assessment of predictive models
 Authors

 P. Jonathan (1)
 W. J. Krzanowski (2)
 W. V. McCarthy (1)
 Author Affiliations

 1. Shell Research Ltd., Chester, CH1 3SH, UK
 2. School of Mathematical Sciences, University of Exeter, Laver Building, North Park Rd., Exeter, EX4 4QE, UK