Predicting Indonesian coffee origins using untargeted SPME-GCMS-based volatile compounds fingerprinting and machine learning approaches

  • Original Paper
European Food Research and Technology

Abstract

Aroma is a fundamental property of coffee and is modulated by volatile compounds (VCs). A VC fingerprint obtained by untargeted SPME-GC/MS can be used to predict coffee origin. Coffee of Indonesian origin has attracted considerable research interest in recent years. However, analysing the data from untargeted VC measurements remains one of the greatest challenges in origin prediction. This study therefore investigated the ability of untargeted SPME-GC/MS-based volatile fingerprinting to predict the origins of Indonesian coffee using machine learning (ML) approaches. Indonesian coffee samples from economically important origins were studied. An SPME arrow, coupled to a GC/MS system, was used to extract the VCs from the coffee headspace at optimised temperatures. Several ML models were compared to find the most accurate origin predictor. Adsorption at 70 °C for 10 min extracted the most reliable VCs for coffee origin prediction. Untargeted headspace-SPME volatile fingerprinting of 200 coffee samples identified 224 features out of 656 detected signals. In the exploratory dataset, the random forest (RF) and partial least squares discriminant analysis (PLS-DA) models performed comparably, predicting Indonesian coffee origins with accuracies of 97% and 95.2%, respectively. In the validation dataset, they reached AUCs of 100% and 95.8%, respectively. Both models also gave promising results in selecting the important features, and these features produced a clear classification when visualised with unsupervised models. Overall, the results demonstrate the reliability of the current workflow for the predictive modelling of Indonesian coffees. This study contributes to the advancement of coffee origin prediction and classification for further authentication.
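The two metrics reported above, accuracy and AUC, measure different things, and a minimal sketch may help show how each is computed. The labels `origin_A` to `origin_C`, the scores, and the helper names `accuracy` and `binary_auc` are all hypothetical and invented for illustration. They are not the study's data or code, and the authors' own analysis may have computed these metrics differently.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted origin matches the true origin."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as one half, which equals the area under the ROC curve.
    """
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): six samples from three made-up origins.
y_true = ["origin_A", "origin_A", "origin_B", "origin_B", "origin_C", "origin_C"]
y_pred = ["origin_A", "origin_A", "origin_B", "origin_A", "origin_C", "origin_C"]
# Hypothetical model scores for the class "origin_A", one per sample.
scores_a = [0.9, 0.8, 0.2, 0.6, 0.1, 0.3]

acc = accuracy(y_true, y_pred)
auc = binary_auc([t == "origin_A" for t in y_true], scores_a)
print(f"accuracy = {acc:.3f}, one-vs-rest AUC (origin_A) = {auc:.3f}")
```

In this toy case one sample is misclassified, so accuracy is below 1, yet every true `origin_A` sample scores higher than every other sample, so the one-vs-rest AUC is 1. Accuracy depends on the final class decision, whereas AUC depends only on how the scores rank the samples.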

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Abbreviations

AUC: Area under the curve
CAR/DVB/PDMS: Carboxen/divinylbenzene/polydimethylsiloxane
CM: Confusion matrix
GC/MS: Gas chromatography/mass spectrometry
HCA: Hierarchical clustering analysis
HPLC: High-performance liquid chromatography
HS-SPME: Headspace solid-phase microextraction
LC/MS: Liquid chromatography/mass spectrometry
MDA: Mean decrease in accuracy
MDG: Mean decrease in Gini
ML: Machine learning
PAH: Polycyclic aromatic hydrocarbons
PCA: Principal component analysis
RF: Random forest
PLS-DA: Partial least squares discriminant analysis
RI: Retention index
RT: Retention time
SVM: Support vector machine
SS: Similarity score
TIC: Total ion chromatogram
kNN: k-nearest neighbour
VC(s): Volatile compound(s)
VIP: Variable importance in projection

Acknowledgements

The authors acknowledge the double-degree sponsorship of FSA by UGSAS, Gifu University, and the Doctoral Program of Universitas Sebelas Maret. We also thank Mr. Jun Iwata of the Baisen Ko-Bo Sora coffee roaster for providing professional roasting equipment and assisting with the roasting method.

Author information

Corresponding author

Correspondence to Danar Praseptiangga.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest that influences the work of this study.

Compliance with ethics requirements

This research does not involve any studies with human or animal subjects.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 319 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aurum, F.S., Imaizumi, T., Thammawong, M. et al. Predicting Indonesian coffee origins using untargeted SPME-GCMS-based volatile compounds fingerprinting and machine learning approaches. Eur Food Res Technol 249, 2137–2149 (2023). https://doi.org/10.1007/s00217-023-04281-2

  • DOI: https://doi.org/10.1007/s00217-023-04281-2
