Manual of Cardiovascular Proteomics pp 275-292
Analysis of Proteomic Data
Abstract
Whether you are a proteomics specialist or simply an end-user of proteomic data, the day will come when you sit down with your dataset, typically a list of proteins or protein clusters whose abundance changes in one or more experimental groups. This change in abundance is often represented as a ratio or fold-change. When the euphoria wears off, the nagging questions set in. How accurate are your data, really? How confident are you in these changes; are they statistically significant? If so, by what statistical test? Are you sure the test is suitable for your data? How would you know? Or perhaps more importantly, as a graduate student, would you spend the next year following up on a proteomic lead? As principal investigator, should you reallocate substantial resources to a new line of enquiry? Given the risk of squandering time and money on false leads or dismissing a nugget that could change existing paradigms, delving more deeply into the principles of robust proteomic analysis, however daunting at first blush, is a good investment.
Keywords
Experimental design, Protein quantitation, Differential regulation, Statistical inference, Empirical Bayes, Significance, Multiple comparison correction, Robustness