Characterization of data analysis methods for information recovery from metabolic 1H NMR spectra using artificial complex mixtures

  • Original Article
Metabolomics

Abstract

The assessment of data analysis methods in 1H NMR-based metabolic profiling is hampered by a lack of knowledge of the exact sample composition. In this study, an artificial complex mixture design comprising two artificially defined groups, designated normal and disease and each containing 30 samples, was implemented using 21 metabolites at concentrations typically found in human urine and having a realistic distribution of inter-metabolite correlations. These artificial mixtures were profiled by 1H NMR spectroscopy and used to assess data analysis methods in the task of differentiating the two conditions. When metabolites were individually quantified, volcano plots provided an excellent method to track the effect size and significance of the change between conditions. Interestingly, the Welch t test detected a similar set of metabolites changing between classes in both quantified and spectral data, suggesting that differential analysis of 1H NMR spectra using a false discovery rate correction, taking into account fold changes, is a reliable approach to detect differential metabolites in complex mixture studies. Various multivariate regression methods based on partial least squares (PLS) were applied in discriminant analysis mode. The most reliable methods in quantified and spectral 1H NMR data were PLS and RPLS linear and logistic regression, respectively. A jackknife-based strategy for variable selection was assessed on both quantified and spectral data, and the results indicate that it may be possible to improve on the conventional Orthogonal-PLS methodology in terms of accuracy and sensitivity. A key improvement of our approach consists of objective criteria for selecting significant signals associated with a condition, which provide a confidence level on the discoveries made and can be implemented in metabolic profiling studies.
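The differential analysis summarized in the abstract (a Welch t test per variable, a false discovery rate correction, and a fold-change criterion, i.e. the two axes of a volcano plot) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the Benjamini-Hochberg choice of FDR procedure, the thresholds `alpha` and `min_abs_log2_fc`, and the simulated data are all assumptions made for illustration, and the log2 fold change assumes strictly positive values such as quantified concentrations.

```python
import numpy as np
from scipy import stats


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q


def differential_analysis(normal, disease, alpha=0.05, min_abs_log2_fc=0.5):
    """Welch t test per column, BH correction, and log2 fold change.

    Rows are samples and columns are variables (metabolites or spectral
    points). The thresholds are hypothetical defaults, not values from
    the article.
    """
    normal = np.asarray(normal, dtype=float)
    disease = np.asarray(disease, dtype=float)
    # equal_var=False selects the Welch (unequal variance) t test.
    t_stat, p = stats.ttest_ind(disease, normal, axis=0, equal_var=False)
    q = benjamini_hochberg(p)
    # Effect size on the volcano plot x axis; assumes positive means.
    log2_fc = np.log2(disease.mean(axis=0) / normal.mean(axis=0))
    selected = (q < alpha) & (np.abs(log2_fc) >= min_abs_log2_fc)
    return {"t": t_stat, "p": p, "q": q, "log2_fc": log2_fc,
            "selected": selected}


if __name__ == "__main__":
    # Simulated stand-in for the design: 30 samples per group, 21 metabolites.
    rng = np.random.default_rng(0)
    normal = rng.lognormal(mean=0.0, sigma=0.3, size=(30, 21))
    disease = rng.lognormal(mean=0.0, sigma=0.3, size=(30, 21))
    disease[:, :3] *= 2.0  # hypothetical change in the first three metabolites
    result = differential_analysis(normal, disease)
    print(np.flatnonzero(result["selected"]))
```

Plotting `log2_fc` against `-log10(q)` gives the volcano plot; requiring both a small q-value and a minimum absolute fold change is one way to combine significance and effect size when selecting variables.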

Acknowledgments

Alexessander Couto Alves acknowledges an Imperial College Faculty of Medicine PhD studentship.

Author information

Corresponding authors

Correspondence to Elaine Holmes or Timothy M. D. Ebbels.

Electronic supplementary material

Supplementary material 1 (DOC 754 kb)

About this article

Cite this article

Alves, A.C., Li, J.V., Garcia-Perez, I. et al. Characterization of data analysis methods for information recovery from metabolic 1H NMR spectra using artificial complex mixtures. Metabolomics 8, 1170–1180 (2012). https://doi.org/10.1007/s11306-012-0422-8

  • DOI: https://doi.org/10.1007/s11306-012-0422-8
