Experiment design beyond gut feeling: statistical tests and power to detect differential metabolites in mass spectrometry data

Trutschel, Diana; Schmidt, Stephan; Grosse, Ivo; Neumann, Steffen

doi:10.1007/s11306-014-0742-y

Original Article
Published: 02 November 2014

Metabolomics, Volume 11, pages 851–860 (2015)

Diana Trutschel¹, Stephan Schmidt¹, Ivo Grosse²,³ & Steffen Neumann¹

Abstract

Univariate hypothesis tests such as Student’s t test or analysis of variance (ANOVA) can help to answer a variety of questions in metabolomics data analysis. The statistical power of these tests depends on the setup of the experiment, the experimental design and the analytical variance of the actual observations. In this paper, we demonstrate how a well-designed pilot study, carried out before an experiment that aims to find differences between, e.g., several genotypes, can help to determine the variance at multiple levels, ranging from biological variance through sample preparation to instrumental variance. Next, we illustrate how these variances can be used to obtain several parameters (e.g. minimum statistically significant effect, number of required replicates and error probabilities) that influence the design of the actual study. In particular, we sketch how technical replicates can improve the performance of a test when they are used correctly in the statistical analysis, e.g. with a hierarchical model. Finally, we demonstrate the process of evaluating the trade-off between different experimental designs with different replication strategies. The choice of an experimental design beyond gut feeling can be influenced by factors such as costs, sample availability and the accuracy of the tests. We use metabolite profiles of the model plant Arabidopsis thaliana measured on a UPLC-ESI/QqTOF-MS as a real-world dataset, but the approach is equally applicable to other sample types and measurement methods such as NMR-based metabolomics.
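
The trade-off between biological and technical replicates described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a balanced two-level design (technical replicates nested in biological replicates), a two-group comparison, and a normal approximation in place of the t distribution or a fitted hierarchical model. The function name `approx_power`, its parameters and all numeric values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, var_bio, var_tech, n_bio, n_tech, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Assumptions (illustrative, not taken from the paper):
      - each group has n_bio biological replicates, each measured
        n_tech times (balanced two-level nesting);
      - var_bio and var_tech are the biological and technical variance
        components, e.g. as estimated from a pilot study;
      - the test statistic is treated as normally distributed.
    """
    # Variance of one group mean: technical noise is averaged over
    # n_tech measurements, then everything is averaged over n_bio samples.
    var_group_mean = (var_bio + var_tech / n_tech) / n_bio
    # Standard error of the difference between the two group means.
    se_diff = sqrt(2.0 * var_group_mean)

    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = delta / se_diff
    # Probability that the statistic falls in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical variance components and effect size.
    for n_bio, n_tech in [(6, 1), (6, 3), (12, 1)]:
        p = approx_power(delta=1.0, var_bio=0.5, var_tech=0.5,
                         n_bio=n_bio, n_tech=n_tech)
        print(f"n_bio={n_bio:2d} n_tech={n_tech} power={p:.3f}")
```

Under these assumptions, technical replicates reduce only the technical part of the variance, so their benefit levels off once the biological variance dominates, whereas additional biological replicates reduce both components.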

Conflict of interest

The authors declare that they have no conflict of interest.

Compliance with ethical requirements

This article does not contain any studies with human or animal subjects.

Author information

Authors and Affiliations

Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany
Diana Trutschel, Stephan Schmidt & Steffen Neumann
Institute of Computer Science, Martin-Luther-University Halle-Wittenberg, Von-Seckendorff-Platz 1, 06120, Halle, Germany
Ivo Grosse
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
Ivo Grosse

Corresponding author

Correspondence to Diana Trutschel.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 318 KB)

Supplementary material 1 (R 13 KB)

Supplementary material 1 (PDF 251 KB)

Supplementary material 1 (RNW 18 KB)

About this article

Cite this article

Trutschel, D., Schmidt, S., Grosse, I. et al. Experiment design beyond gut feeling: statistical tests and power to detect differential metabolites in mass spectrometry data. Metabolomics 11, 851–860 (2015). https://doi.org/10.1007/s11306-014-0742-y

Received: 29 April 2014
Accepted: 17 October 2014
Published: 02 November 2014
Issue Date: August 2015
DOI: https://doi.org/10.1007/s11306-014-0742-y
