Cross-validation of component models: A critical look at current methods

Bro, R.; Kjeldahl, K.; Smilde, A. K.; Kiers, H. A. L.

doi:10.1007/s00216-007-1790-1

Review
Published: 24 January 2008

Volume 390, pages 1241–1251, (2008)

Analytical and Bioanalytical Chemistry

Abstract

In regression, cross-validation is an effective and popular approach that is used to decide, for example, the number of underlying features, and to estimate the average prediction error. The basic principle of cross-validation is to leave out part of the data, build a model, and then predict the left-out samples. While such an approach can also be envisioned for component models such as principal component analysis (PCA), most current implementations do not comply with the essential requirement that the predictions should be independent of the entity being predicted. Further, these methods have not been properly reviewed in the literature. In this paper, we review the most commonly used generic PCA cross-validation schemes and assess how well they work in various scenarios.
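The independence requirement mentioned above can be illustrated with a minimal sketch. It assumes a naive leave-one-row-out scheme (a hypothetical scheme chosen for illustration, not necessarily any method reviewed in the paper) in which loadings are fitted on the remaining rows and the left-out row is then projected onto them. The function name `naive_rowwise_press`, the simulated data, and the use of NumPy are all assumptions. Because the scores of the left-out row are computed from that same row, its "prediction" is not independent of it, and the summed squared residual (PRESS) cannot increase as components are added.

```python
import numpy as np

def naive_rowwise_press(X, max_components):
    """PRESS per component count from a naive leave-one-row-out PCA (illustrative)."""
    press = np.zeros(max_components)
    for i in range(X.shape[0]):
        # Leave out row i and fit the model on the remaining rows.
        train = np.delete(X, i, axis=0)
        mean = train.mean(axis=0)
        # Rows of vt are the PCA loadings of the centered training data.
        _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
        x = X[i] - mean
        for k in range(1, max_components + 1):
            loadings = vt[:k].T
            # The scores are computed from x itself, so the reconstruction
            # of x is not independent of the row being "predicted".
            scores = x @ loadings
            residual = x - scores @ loadings.T
            press[k - 1] += residual @ residual
    return press

# Simulated data (assumption): two underlying components plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(20, 6))
press = naive_rowwise_press(X, max_components=6)
print(press)
```

In this sketch PRESS keeps falling up to the full number of variables, so it gives no indication that two components generated the data. A scheme that satisfies the independence requirement would have to predict left-out elements without using those elements themselves.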

Author information

Authors and Affiliations

Chemometrics Group, Faculty of Life Sciences, University of Copenhagen, 1958, Frederiksberg C, Denmark
R. Bro & K. Kjeldahl
Biosystems Data Analysis (BDA), Swammerdam Institute for Life Sciences, Nieuwe Achtergracht 166, 1018 WV, Amsterdam, The Netherlands
A. K. Smilde
Heymans Institute (DPMG), University of Groningen, Grote Kruisstraat 2/1, 9712 TS, Groningen, The Netherlands
H. A. L. Kiers

Corresponding author

Correspondence to R. Bro.

About this article

Cite this article

Bro, R., Kjeldahl, K., Smilde, A.K. et al. Cross-validation of component models: A critical look at current methods. Anal Bioanal Chem 390, 1241–1251 (2008). https://doi.org/10.1007/s00216-007-1790-1

Received: 24 September 2007
Revised: 28 November 2007
Accepted: 04 December 2007
Published: 24 January 2008
Issue Date: March 2008
DOI: https://doi.org/10.1007/s00216-007-1790-1
