Volume 8, Issue 6, pp 1170–1180

Characterization of data analysis methods for information recovery from metabolic 1H NMR spectra using artificial complex mixtures

  • Alexessander C. Alves
  • Jia V. Li
  • Isabel Garcia-Perez
  • Caroline Sands
  • Coral Barbas
  • Elaine Holmes
  • Timothy M. D. Ebbels
Original Article


The assessment of data analysis methods in 1H NMR-based metabolic profiling is hampered by a lack of knowledge of the exact sample composition. In this study, an artificial complex mixture design comprising two artificially defined groups, designated normal and disease and each containing 30 samples, was implemented using 21 metabolites at concentrations typically found in human urine and having a realistic distribution of inter-metabolite correlations. These artificial mixtures were profiled by 1H NMR spectroscopy and used to assess data analysis methods in the task of differentiating the two conditions. When metabolites were individually quantified, volcano plots provided an excellent method to track the effect size and significance of the change between conditions. Interestingly, the Welch t test detected a similar set of metabolites changing between classes in both quantified and spectral data, suggesting that differential analysis of 1H NMR spectra using a false discovery rate correction, taking into account fold changes, is a reliable approach to detect differential metabolites in complex mixture studies. Various multivariate regression methods based on partial least squares (PLS) were applied in discriminant analysis mode. The most reliable methods in quantified and spectral 1H NMR data were PLS and RPLS linear and logistic regression, respectively. A jackknife-based strategy for variable selection was assessed on both quantified and spectral data, and the results indicate that it may be possible to improve on the conventional Orthogonal-PLS methodology in terms of accuracy and sensitivity. A key improvement of our approach consists of objective criteria for selecting significant signals associated with a condition, which provide a confidence level on the discoveries made and can be implemented in metabolic profiling studies.
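The differential analysis summarized above (a Welch t test per metabolite, a false discovery rate correction, and a fold-change filter) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the thresholds `alpha` and `min_abs_log2_fc`, the toy data, and the use of the Benjamini-Hochberg procedure as the FDR correction are all assumptions made for illustration. The p-value uses a normal approximation to the t distribution, which is only reasonable when the degrees of freedom are large, as they would be with 30 samples per group.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def approx_p(t):
    """Two-sided p-value via a normal approximation (assumption: large df)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differential(normal, disease, alpha=0.05, min_abs_log2_fc=0.5):
    """Select variables passing both the FDR and the fold-change threshold.

    normal, disease: dicts mapping a variable name to a list of positive values.
    """
    names = sorted(normal)
    pvals, log2_fc = [], {}
    for name in names:
        t, _df = welch_t(disease[name], normal[name])
        pvals.append(approx_p(t))
        log2_fc[name] = math.log2(mean(disease[name]) / mean(normal[name]))
    qvals = benjamini_hochberg(pvals)
    return [n for n, q in zip(names, qvals)
            if q <= alpha and abs(log2_fc[n]) >= min_abs_log2_fc]


# Hypothetical toy data: 30 samples per group, two made-up variables.
base = [10 + 0.1 * (i % 5) for i in range(30)]
toy_normal = {"metabolite_A": base, "metabolite_B": base}
toy_disease = {"metabolite_A": [x + 10 for x in base], "metabolite_B": base}
print(differential(toy_normal, toy_disease))
```

The pair of values (log2 fold change, adjusted p-value) computed per variable is also what a volcano plot displays, with the fold change on one axis and the negative log of the p-value on the other. For small sample sizes, an exact t distribution (for example from a statistics library) would be a better choice than the normal approximation used here.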


Keywords: Artificial mixtures, Data analysis, t test, PLS, NMR



Alexessander Couto Alves acknowledges an Imperial College Faculty of Medicine PhD studentship.

Supplementary material

Supplementary material 1 (DOC 754 kb)



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Alexessander C. Alves (1)
  • Jia V. Li (1)
  • Isabel Garcia-Perez (1)
  • Caroline Sands (1)
  • Coral Barbas (2)
  • Elaine Holmes (1)
  • Timothy M. D. Ebbels (1)

  1. Section of Biomolecular Medicine, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, UK
  2. CEMBIO (Center for Metabolomics and Bioanalysis), Pharmacy Faculty, Campus Monteprincipe, San Pablo-CEU University, Boadilla del Monte, Spain
