Statistics in Experimental Design, Preprocessing, and Analysis of Proteomics Data

  • Protocol
Data Mining in Proteomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 696)

Abstract

High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), usually yield high-dimensional data sets of expression values for hundreds or thousands of proteins, which are, however, observed in only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to obtain tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated, with a particular focus on studies for the detection of differentially regulated proteins. Furthermore, sample size planning, statistical analysis of expression levels, and methods for data preprocessing are covered.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Jung, K. (2011). Statistics in Experimental Design, Preprocessing, and Analysis of Proteomics Data. In: Hamacher, M., Eisenacher, M., Stephan, C. (eds) Data Mining in Proteomics. Methods in Molecular Biology, vol 696. Humana Press. https://doi.org/10.1007/978-1-60761-987-1_16

  • DOI: https://doi.org/10.1007/978-1-60761-987-1_16

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-986-4

  • Online ISBN: 978-1-60761-987-1

  • eBook Packages: Springer Protocols
