Variable Selection Applied to the Development of a Robust Method for the Quantification of Coffee Blends Using Mid Infrared Spectroscopy

Abstract

This paper combined attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR), multivariate calibration with partial least squares (PLS), and different variable selection methods to develop models for determining the Robusta content of Robusta-Arabica coffee blends in the analytical range from 0.0 to 33.0% w/w. Ground samples of different origins were roasted at three different levels: light, medium, and dark. Specific models were built for each roasting level, and a robust model was also obtained including all the samples. Mid infrared spectra were recorded in the wavenumber range between 4000 and 800 cm−1 for the 120 samples used in the models. Four variable selection methods were tested: genetic algorithm (GA), ordered predictors selection (OPS), successive projections algorithm (SPA), and interval PLS (iPLS). The best results were obtained using GA and OPS, which decreased the root mean square errors of prediction (RMSEP) by 44–68% compared with full-spectrum models. The best robust model was obtained with OPS, providing an RMSEP of 1.8% w/w. The number of selected variables in the optimized models varied from 6.5 to 17.0% of the total number of original variables. This demonstrated the importance of selecting a limited number of wavenumbers richer in information specifically related to the analytes. All the methods were validated by estimating appropriate figures of merit and were considered accurate, linear, sensitive, and unbiased.
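
As a rough illustration of how interval-based variable selection can be compared with a full-spectrum PLS model through RMSEP, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic "spectra", the hand-written NIPALS PLS1 routine, the 80/40 calibration/validation split, the ten equal-width intervals, and the use of two latent variables. The numbers it prints describe the synthetic data only and say nothing about the results reported in the abstract.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                      # weight vector
        w /= np.linalg.norm(w)
        t = Xr @ w                         # scores
        tt = t @ t
        p = Xr.T @ t / tt                  # X loadings
        c = (yr @ t) / tt                  # y loading
        Xr = Xr - np.outer(t, p)           # deflate X
        yr = yr - c * t                    # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in X space
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def cv_rmse(X, y, n_components, n_folds=5):
    """Root mean square error of k-fold cross-validation (contiguous folds)."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    residuals = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        model = pls1_fit(X[train_idx], y[train_idx], n_components)
        residuals.append(y[test_idx] - pls1_predict(model, X[test_idx]))
    return float(np.sqrt(np.mean(np.concatenate(residuals) ** 2)))

# Synthetic data (assumption): 120 samples, 200 variables, y in 0-33 % w/w.
rng = np.random.default_rng(0)
n_samples, n_vars = 120, 200
y = rng.uniform(0.0, 33.0, n_samples)
axis = np.arange(n_vars)
analyte_band = np.exp(-0.5 * ((axis - 90) / 5.0) ** 2)      # informative region
interferent = np.exp(-0.5 * ((axis - 40) / 15.0) ** 2)      # unrelated variation
X = (0.01 * np.outer(y, analyte_band)
     + rng.normal(size=(n_samples, 1)) * interferent
     + rng.normal(scale=0.02, size=(n_samples, n_vars)))

# Hypothetical split: first 80 samples for calibration, last 40 for validation.
X_cal, y_cal, X_val, y_val = X[:80], y[:80], X[80:], y[80:]
n_lv, n_intervals = 2, 10

# Interval PLS idea: one local model per interval, ranked by cross-validation error.
intervals = np.array_split(axis, n_intervals)
cv_errors = [cv_rmse(X_cal[:, cols], y_cal, n_lv) for cols in intervals]
best = int(np.argmin(cv_errors))
selected = intervals[best]

# Compare prediction errors on the held-out validation set.
rmsep_full = rmse(y_val, pls1_predict(pls1_fit(X_cal, y_cal, n_lv), X_val))
rmsep_sel = rmse(y_val, pls1_predict(
    pls1_fit(X_cal[:, selected], y_cal, n_lv), X_val[:, selected]))

print(f"selected interval: variables {selected[0]}-{selected[-1]} "
      f"({100 * len(selected) / n_vars:.1f}% of variables)")
print(f"RMSEP full spectrum: {rmsep_full:.2f}, selected interval: {rmsep_sel:.2f}")
```

The interval is chosen using only the calibration samples, and the validation samples are held back for the final RMSEP comparison, so the reported error is not biased by the selection step itself.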




Acknowledgements

The authors are grateful to CAPES and FAPEMIG for funding this research and to Laboratório de Biocombustíveis (UFMG, Belo Horizonte, Brazil) for allowing the use of the MIR spectrometer.

Author information

Corresponding author

Correspondence to Marcelo M. Sena.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

Not applicable.

Conflict of Interest

Camila Assis declares that she has no conflict of interest. Leandro Soares Oliveira declares that he has no conflict of interest. Marcelo Martins Sena declares that he has no conflict of interest.

About this article

Cite this article

Assis, C., Oliveira, L.S. & Sena, M.M. Variable Selection Applied to the Development of a Robust Method for the Quantification of Coffee Blends Using Mid Infrared Spectroscopy. Food Anal. Methods 11, 578–588 (2018). https://doi.org/10.1007/s12161-017-1027-7

Keywords

  • Coffee blends
  • FTIR
  • Multivariate calibration
  • PLS regression
  • Variable selection
  • Food authentication