Proteomics pp 325-337

Statistical Assessment of QC Metrics on Raw LC-MS/MS Data

  • Xia Wang
Part of the Methods in Molecular Biology book series (MIMB, volume 1550)


Data quality assessment is essential for the reproducibility of proteomics experiments and the reusability of proteomics data. We describe a set of statistical tools for routinely visualizing and examining the quality control (QC) metrics obtained from raw LC-MS/MS data across different instrument types and mass spectrometers. The QC metrics used here are the identification-free QuaMeter metrics. The statistical assessments introduced are (a) principal component analysis, (b) dissimilarity measures, (c) the T²-chart for quality control, and (d) change point analysis. We demonstrate the workflow through a step-by-step assessment of a subset of Study 5 of the Clinical Proteomics Technology Assessment for Cancer (CPTAC), using our R functions.
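To make assessment (b) concrete, the following is a minimal sketch of a Euclidean dissimilarity score between experiments. It is not the chapter's R code. The function names (`robust_scale`, `median_dissimilarity`), the choice of median/MAD scaling, and the toy metric values are all assumptions made for illustration; the values are invented and are not CPTAC data.

```python
import math
from statistics import median


def robust_scale(rows):
    """Center each metric (column) by its median and divide by its MAD.

    Assumption: median/MAD scaling is used here so that a single abnormal
    run does not dominate the scale; other scalings are equally possible.
    """
    scaled_cols = []
    for col in zip(*rows):
        med = median(col)
        mad = median(abs(v - med) for v in col)
        scale = mad if mad > 0 else 1.0  # guard against constant columns
        scaled_cols.append([(v - med) / scale for v in col])
    # Transpose back to one row per experiment.
    return [list(r) for r in zip(*scaled_cols)]


def euclidean(a, b):
    """Euclidean distance between two equal-length metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def median_dissimilarity(rows):
    """For each experiment, the median distance to all other experiments."""
    scaled = robust_scale(rows)
    n = len(scaled)
    return [
        median(euclidean(scaled[i], scaled[j]) for j in range(n) if j != i)
        for i in range(n)
    ]


# Hypothetical QC table: one row per LC-MS/MS run, one column per metric.
runs = [
    [10.0, 200.0, 0.50],
    [10.2, 198.0, 0.52],
    [9.9, 203.0, 0.49],
    [10.1, 201.0, 0.51],
    [14.0, 150.0, 0.90],  # deliberately shifted run
]

scores = median_dissimilarity(runs)
most_dissimilar = scores.index(max(scores))
print(most_dissimilar, [round(s, 2) for s in scores])
```

A run whose score is much larger than the rest is a candidate abnormal experiment and would be worth inspecting before any downstream analysis. In this toy table the deliberately shifted last run receives the largest score.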

Key words

Abnormal experiment; Change point; Dissimilarity; Euclidean distance; MS/MS proteomics; Principal component analysis; Quality control



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. Department of Mathematical Sciences, University of Cincinnati, Cincinnati, USA
