Abstract
The cross-validation of principal components is a problem that occurs in many applications of statistics. The naive approach of omitting each observation in turn and repeating the principal component calculations is computationally costly. In this paper we present an efficient approach to leave-one-out cross-validation of principal components. This approach exploits the regular nature of leave-one-out principal component eigenvalue downdating. We derive influence statistics and consider the application to principal component regression.
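The rank-one structure behind leave-one-out downdating can be illustrated with a short NumPy sketch. This is not the authors' algorithm. It assumes PCA is based on the mean-centred cross-product matrix S, and it only checks the standard identity that omitting observation i changes S by a rank-one term, S_(i) = S - (n/(n-1)) d d^T with d = x_i - x̄. The function names and the simulated data are hypothetical.

```python
import numpy as np

def scatter(X):
    """Mean-centred cross-product matrix of the rows of X."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc

def loo_eigenvalues_naive(X):
    """Naive approach: drop each row in turn and recompute everything."""
    n = X.shape[0]
    return np.array([
        np.linalg.eigvalsh(scatter(np.delete(X, i, axis=0)))
        for i in range(n)
    ])

def loo_eigenvalues_downdate(X):
    """Form S once, then apply a rank-one downdate per omitted row.

    Assumption for illustration: eigvalsh is still called on each
    downdated matrix. An efficient method would instead exploit the
    rank-one structure to update the eigenvalues directly.
    """
    n = X.shape[0]
    xbar = X.mean(axis=0)
    S = scatter(X)
    out = []
    for i in range(n):
        d = X[i] - xbar
        S_i = S - (n / (n - 1)) * np.outer(d, d)
        out.append(np.linalg.eigvalsh(S_i))
    return np.array(out)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))   # simulated data: 20 observations, 4 variables

naive = loo_eigenvalues_naive(X)
downdated = loo_eigenvalues_downdate(X)
agree = np.allclose(naive, downdated)   # both routes give the same eigenvalues
full = np.linalg.eigvalsh(scatter(X))   # eigenvalues with no row omitted
```

Because each downdate subtracts a positive semidefinite rank-one matrix, every leave-one-out eigenvalue is bounded above by the corresponding full-data eigenvalue and below by the next smaller one. This interlacing is the kind of regularity that a rank-one eigenvalue update can exploit.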
Cite this article
Mertens, B., Fearn, T. & Thompson, M. The efficient cross-validation of principal components applied to principal component regression. Stat Comput 5, 227–235 (1995). https://doi.org/10.1007/BF00142664
DOI: https://doi.org/10.1007/BF00142664