Abstract
In the analysis of data acquired from label-free experiments by liquid chromatography coupled with mass spectrometry (LC-MS), accounting for potential sources of variability can improve the detection of true differences in ion abundance. Mixed effects models are commonly used to estimate the variability due to heterogeneity of the biological specimens, differences in sample preparation, and instrument variation. In this chapter, we investigate mixed effects models and evaluate their performance in difference detection in comparison with other methods such as the marginal t-test, which averages over the analytical and technical replicates within each biological sample before statistical analysis. We also discuss experimental design, including replication assignment and sample size calculation. Both depend strongly on the variation contributed by the different sources, which can be estimated from LC-MS pilot studies before large-scale label-free experiments are run.
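The marginal t-test and the idea of estimating variance contributions from a pilot study can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes a simplified, balanced two-level design (biological samples, each with the same number of replicates) for a single ion on the log scale, and it substitutes classical ANOVA method-of-moments estimates for a fitted mixed effects model. The function names `marginal_t` and `variance_components` and the simulation settings are hypothetical.

```python
import math
import random
import statistics


def marginal_t(group_a, group_b):
    """Pooled two-sample t statistic computed on per-sample replicate averages.

    Each group is a list of biological samples; each sample is a list of
    replicate measurements (assumed log-abundances of one ion).
    Returns (t statistic, degrees of freedom).
    """
    a = [statistics.fmean(reps) for reps in group_a]
    b = [statistics.fmean(reps) for reps in group_b]
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, na + nb - 2


def variance_components(group):
    """ANOVA method-of-moments estimates for a balanced design.

    Returns (between-sample variance, within-sample replicate variance).
    Assumes every biological sample has the same number of replicates.
    """
    n_rep = len(group[0])
    sample_means = [statistics.fmean(reps) for reps in group]
    ms_within = statistics.fmean([statistics.variance(reps) for reps in group])
    ms_between = n_rep * statistics.variance(sample_means)
    # E[ms_between] = n_rep * var_between + var_within; truncate at zero.
    var_between = max(0.0, (ms_between - ms_within) / n_rep)
    return var_between, ms_within


def simulate_group(n_bio, n_rep, mean, sd_bio, sd_rep, rng):
    """Simulate one group: n_bio biological samples with n_rep replicates each."""
    group = []
    for _ in range(n_bio):
        bio_level = rng.gauss(mean, sd_bio)  # biological sample effect
        group.append([rng.gauss(bio_level, sd_rep) for _ in range(n_rep)])
    return group


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    cases = simulate_group(10, 3, mean=1.0, sd_bio=0.5, sd_rep=0.3, rng=rng)
    controls = simulate_group(10, 3, mean=0.0, sd_bio=0.5, sd_rep=0.3, rng=rng)
    print("marginal t, df:", marginal_t(cases, controls))
    print("variance components (controls):", variance_components(controls))
```

Under these assumptions, the variance of a group mean is var_between / n_bio + var_within / (n_bio * n_rep), which is why pilot estimates of the two components inform how many biological samples and replicates to run.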
Acknowledgements
This work was supported in part by National Institutes of Health grants U01CA185188 and R01GM086746, awarded to HWR.
Copyright information
© 2017 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Zhao, Y., Tsai, TH., Di Poto, C., Pannell, L.K., Tadesse, M.G., Ressom, H.W. (2017). Variability Assessment of Label-Free LC-MS Experiments for Difference Detection. In: Datta, S., Mertens, B. (eds) Statistical Analysis of Proteomics, Metabolomics, and Lipidomics Data Using Mass Spectrometry. Frontiers in Probability and the Statistical Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-45809-0_9
DOI: https://doi.org/10.1007/978-3-319-45809-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45807-6
Online ISBN: 978-3-319-45809-0
eBook Packages: Mathematics and Statistics (R0)