NIR Data Exploration and Regression by Chemometrics—A Primer

Near-Infrared Spectroscopy

Abstract

This chapter is a primer on the application of multivariate data analysis, or chemometrics, to near-infrared spectra. The extraordinary synergy between near-infrared spectroscopy and chemometric data analysis has led to a green analytical revolution in quality control and process monitoring across practically all areas of the life sciences and related industries. Near-infrared spectroscopy is nondestructive, rapid and environmentally friendly. Its most distinctive advantage, however, is that it can measure samples, solids as well as liquids, remotely and without bias, as is: no sample preparation is needed and the sample is not disturbed. This success would not have been possible without chemometric data processing. The chapter gives an overview, including tricks of the trade, of the most common chemometric techniques for analysing ensembles of near-infrared spectra, illustrated by downloadable data examples.

Author information

Corresponding author

Correspondence to Søren Balling Engelsen.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Sørensen, K.M., van den Berg, F., Engelsen, S.B. (2021). NIR Data Exploration and Regression by Chemometrics—A Primer. In: Ozaki, Y., Huck, C., Tsuchikawa, S., Engelsen, S.B. (eds) Near-Infrared Spectroscopy. Springer, Singapore. https://doi.org/10.1007/978-981-15-8648-4_7
