Abstract
Chemometrics has achieved major recognition and progress in the field of analytical chemistry. The first part of this tutorial summarises the major achievements and contributions of chemometrics to some of the more important stages of the analytical process, such as experimental design, sampling, and data analysis (including data pretreatment and fusion). The tutorial is intended to give a general, updated overview of the chemometrics field and thereby contribute to its dissemination and promotion in analytical chemistry.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
All participants belong to the chemometrics study group of the Division of Analytical Chemistry of EuCheMS.
Cite this article
Brereton, R.G., Jansen, J., Lopes, J. et al. Chemometrics in analytical chemistry—part I: history, experimental design and data analysis tools. Anal Bioanal Chem 409, 5891–5899 (2017). https://doi.org/10.1007/s00216-017-0517-1
DOI: https://doi.org/10.1007/s00216-017-0517-1