
Analytical and Bioanalytical Chemistry, Volume 410, Issue 23, pp 5981–5992

Overoptimism in cross-validation when using partial least squares-discriminant analysis for omics data: a systematic study

  • Raquel Rodríguez-Pérez
  • Luis Fernández
  • Santiago Marco
Research Paper

Abstract

Advances in analytical instrumentation have made it possible to examine thousands of genes, peptides, or metabolites in parallel. However, the cost and time-consuming nature of data acquisition cause a generalized scarcity of samples. From a data analysis perspective, omics data are therefore characterized by high dimensionality and small sample counts. In many scenarios, the analytical aim is to differentiate between two conditions or classes by combining an analytical method with a tailored qualitative predictive model built from the available examples collected in a dataset. For this purpose, partial least squares-discriminant analysis (PLS-DA) is frequently employed in omics research. Recently, there has been growing concern about the uncritical use of this method, since it is prone to overfitting and may aggravate problems of false discoveries. In many applications involving a small number of subjects or samples, predictive model performance is estimated only from cross-validation (CV) results, with a strong preference for reporting leave-one-out (LOO) results. The combination of PLS-DA for high-dimensionality data under small sample conditions with a weak validation methodology is a recipe for unreliable estimations of model performance. In this work, we present a systematic study of the impact of dataset size, dimensionality, and the CV technique used on PLS-DA overoptimism when performance is estimated by cross-validation. First, using synthetic data generated from the same probability distribution and assigned random binary labels, we obtained a dataset whose true classification rate (CR) is 50%. As expected, our results confirm that internal validation provides overoptimistic estimations of the classification accuracy (i.e., overfitting). We have characterized the CR estimator in terms of bias and variance as a function of the internal CV technique used and the sample-to-dimensionality ratio.
In small sample conditions, the large bias and variance of the estimator make the occurrence of extremely good CRs common. We found that overfitting peaks when the sample size of the training subset approaches the feature vector dimensionality minus one. Under these conditions, the models are neither under- nor overdetermined, with a unique solution. This effect is particularly intense for LOO and is strongest in small sample conditions. Overoptimism decreases beyond this point, where the abundance of noisy variables produces a regularization effect leading to less complex models. In terms of overfitting, our study ranks CV methods as follows: bootstrap produces the most accurate estimator of the CR, followed by bootstrapped Latin partitions, random subsampling, and K-fold; finally, the very popular LOO provides the worst results. The simulation results are further confirmed on real datasets from mass spectrometry and microarrays.

Keywords

Metabolomics · Mass spectrometry · Microarrays · Chemometrics · Data analysis · Classification · Method validation

Notes

Authors’ contributions

RR wrote the software, analyzed the data, and prepared the figures and text. LF supervised the code of RR and provided useful insights. SM conceived the study and supervised the work. RR and SM contributed to writing the manuscript. All authors read and approved the final manuscript.

Funding information

This work was partially funded by the Spanish MINECO program, under grants TEC2011-26143 (SMART-IMS) and TEC2014-59229-R (SIGVOL). The Signal and Information Processing for Sensor Systems group is a consolidated Grup de Recerca de la Generalitat de Catalunya and has support from the Departament d’Universitats, Recerca i Societat de la Informació de la Generalitat de Catalunya (expedient 2017 SGR 1721). This work has received support from the Comissionat per a Universitats i Recerca del DIUE de la Generalitat de Catalunya and the European Social Fund (ESF). Additional financial support has been provided by the Institut de Bioenginyeria de Catalunya (IBEC). IBEC is a member of the CERCA Programme/Generalitat de Catalunya.

Compliance with ethical standards

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and material

The microarray dataset analyzed during the current study is publicly available at http://ccb.nki.nl/data/.

Competing interests

The authors declare that they have no competing interests.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Signal and Information Processing for Sensing Systems, Institute for Bioengineering of Catalonia, The Barcelona Institute for Science and Technology, Barcelona, Spain
  2. Department of Electronics and Biomedical Engineering, University of Barcelona, Barcelona, Spain
